Mailing List Archive

Filter and stop-words
I'm new to Lucene. First of all, I would like to know if there is a
searchable archive of this list, like the one for the "sun servlets list".

My first problem is that I want to index a Portuguese database and I need
to remove the plural "s" and the accents (à é ...) from the words. Is there a
way of passing a filter class to the Lucene indexer? And as for the
stop-words, where should I configure Lucene to ignore them?

Any help would be appreciated,

thanks a lot,

jk


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
RE: Filter and stop-words
To remove plural forms you have to create a stemmer for your language. I have
been working on porting a Norwegian stemmer to Lucene; to get a head start I
ported the Norwegian Snowball stemmer. There is one for Portuguese as well,
check it out!

http://snowball.sourceforge.net/portuguese/stemmer.html
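
As a rough illustration of the three steps asked about (accent removal,
stop-words, plural stripping), here is a minimal sketch in plain Java. It
deliberately does not use the Lucene API, so it does not depend on any
particular Lucene version; in Lucene itself this kind of logic would normally
live in a TokenFilter inside a custom Analyzer that you hand to the indexer.
The class name PortugueseNormalizer, the tiny stop-word list and the naive
"drop a trailing s" rule are all assumptions made for the example. The rule is
only a stand-in for a real stemmer such as the Snowball one linked above.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
public class PortugueseNormalizer {

    // Illustrative stop-word list only; a real list would be much longer.
    // Entries are stored without accents because lookup happens after stripping.
    public static final Set<String> STOP_WORDS = new HashSet<>(
        Arrays.asList("a", "o", "as", "os", "de", "da", "do", "e", "em", "um", "uma"));

    // Decompose accented letters (NFD) and drop the combining marks.
    public static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    // Naive plural handling (an assumption, not a real stemmer):
    // drop one trailing "s" from words longer than three letters.
    public static String stripPlural(String w) {
        return (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }

    // Lowercase, split on anything that is not a letter or combining mark,
    // strip accents, skip stop-words, then strip the plural "s".
    public static List<String> normalize(String text) {
        List<String> out = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{M}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first token on a leading separator
            }
            String w = stripAccents(raw);
            if (STOP_WORDS.contains(w)) {
                continue;
            }
            out.add(stripPlural(w));
        }
        return out;
    }

    public static void main(String[] args) {
        // "As casas à beira do café", written with escapes to avoid encoding issues.
        System.out.println(normalize("As casas \u00e0 beira do caf\u00e9"));
        // prints [casa, beira, cafe]
    }
}
```

The same normalization has to be applied to the query text at search time,
otherwise an accented or plural query term will not match the indexed form.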

Kind regards, Karl Øie


-----Original Message-----
From: Bizu de Anúncio [mailto:atendimento@bizudeanuncio.com]
Sent: 3 December 2001 13:22
To: lucene-user@jakarta.apache.org
Subject: Filter and stop-words


