Mailing List Archive

Indexing a Database && Spanish
Hi!
I've tried Lucene and it looks very good, but I have two questions:

- I've tried a sample that indexes a web site (all the HTML files), but now I
would like to combine information from a directory and information from a
database in the same index. Is that possible? Is there a DatabaseDocument
like HTMLDocument? Does anyone have a sample? Has anyone tried this?

- I would like to index Spanish text. Is StandardAnalyzer well suited to it,
or do I have to create an org.apache.lucene.analysis.es package, like
org.apache.lucene.analysis.de for German? Does a Spanish one already exist?



Thanks in advance!




Jaume Homs | Technical Leader

Lost Boys Spain | Diputació, 246 | 08007 Barcelona España | www.lostboys.es
jaume.homs@lostboys.es | ICQ: 97362231 | Tel: +34 93 4457200 | Fax: +34 93
4457220

Amsterdam | Barcelona | Berlin | London | Madrid | Paris | San Francisco |
Warsaw | Zurich







--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: Indexing a Database && Spanish
> ...
> - I've tried a sample that indexes a web site (all the HTML files), but now I
> would like to combine information from a directory and information from a
> database in the same index. Is that possible? Is there a DatabaseDocument
> like HTMLDocument? Does anyone have a sample? Has anyone tried this?

It is possible. Lucene neither knows nor cares where the information
comes from in the first place.

How about:

    Document htmld = getDocFromHtml();
    writer.addDocument(htmld);
    Document dbd = getDocumentFromDB();
    writer.addDocument(dbd);

where getDocumentFromDB() will read whatever info you want
from your database and load it into a Lucene Document.
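To make the idea concrete, here is a minimal sketch of how both sources can end up as the same kind of document. It deliberately does not use Lucene's classes: Doc and Writer are hypothetical stand-ins for Document and the index writer, and the field names ("source", "id", "title", "contents") and the column names ("id", "title", "body") are assumptions chosen for illustration. In a real program the row map would come from a JDBC ResultSet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MixedSourceSketch {

    /** Hypothetical stand-in for a Lucene Document: a bag of named text fields. */
    static final class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();

        Doc add(String name, String value) {
            fields.put(name, value);
            return this;
        }
    }

    /** Hypothetical stand-in for the index writer; it only collects documents. */
    static final class Writer {
        final List<Doc> docs = new ArrayList<>();

        void addDocument(Doc doc) {
            docs.add(doc);
        }
    }

    /** Builds a document from an already-parsed HTML page. */
    static Doc docFromHtml(String path, String title, String bodyText) {
        return new Doc()
            .add("source", "html")
            .add("id", path)
            .add("title", title)
            .add("contents", bodyText);
    }

    /** Builds a document from one database row, passed in as column -> value. */
    static Doc docFromDbRow(String table, Map<String, String> row) {
        return new Doc()
            .add("source", "db")
            .add("id", table + ":" + row.get("id"))
            .add("title", row.get("title"))
            .add("contents", row.get("body"));
    }

    public static void main(String[] args) {
        Writer writer = new Writer();
        writer.addDocument(docFromHtml("/docs/index.html", "Home", "Welcome text"));

        // Assumed row shape; a real program would read this from a ResultSet.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("title", "First article");
        row.put("body", "Article text stored in the database");
        writer.addDocument(docFromDbRow("articles", row));

        // Both documents now sit side by side in the same "index".
        for (Doc doc : writer.docs) {
            System.out.println(doc.fields);
        }
    }
}
```

Using the same field names for both sources means one query can match either kind of document, while the "source" field lets you tell them apart or restrict a search to one of them.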


> - I would like to index Spanish text. Is StandardAnalyzer well suited to it,
> or do I have to create an org.apache.lucene.analysis.es package, like
> org.apache.lucene.analysis.de for German? Does a Spanish one already exist?

StandardAnalyzer uses StandardTokenizer, and the javadocs for that say
"This should be a good tokenizer for most European-language documents",
but I have no personal experience of using it for any language other
than English.
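For a rough feel of what a Spanish-specific step might add on top of generic tokenization, here is a small sketch in plain Java. It does not use any Lucene classes and is not how any Lucene analyzer is implemented; the class name, the method names and the tiny stop-word list are all assumptions made for illustration. It lowercases the text, splits on non-letter characters, strips diacritics, and drops a few common Spanish stop words.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SpanishTermSketch {

    // Tiny illustrative stop list (an assumption), far from complete.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "el", "la", "los", "las", "de", "del", "y", "en", "un", "una", "que"));

    /** Removes diacritics by decomposing characters and dropping combining marks. */
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    /** Lowercases, splits on non-letter/digit runs, folds accents, drops stop words. */
    static List<String> terms(String text) {
        List<String> result = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first piece if text starts with a separator
            }
            String term = fold(raw);
            if (!STOP_WORDS.contains(term)) {
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(terms("La informaci\u00f3n del \u00edndice en espa\u00f1ol"));
    }
}
```

Whether folding accents is desirable depends on the data: it lets a query typed without accents match accented text, but it also merges words that differ only by a diacritic. The same processing has to be applied at both index time and query time.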



--
Ian.
ian.lea@blackwell.co.uk
