XML Indexing Samples
I have put together a hopefully useful package that demonstrates our
current experiments with using Lucene for XML indexing. You can get the
files by anonymous ftp from che.isogen.com, /outgoing/lucene. There are
two zip files:

- lucene_xml_indexing.zip

This is the core indexing code and a little Java app that lets you do
searches and see the results (including going back to the original docs
to get data not stored in the index). There is documentation that should
get you going. It also includes Jython support for interacting with the
indexer and Lucene if you don't like GUIs (I wrote the Jython first and
then the GUI, if you're wondering).

- lucene_xml_sample_index.zip

This is a sample index containing three books from the New Testament
out of the Jon Bosak World Religions document set. I've included this
sample index because the index feature of the GUI may not work: it works
when I run the code from JBuilder but didn't appear to work when I ran
it standalone, and I've run out of time to spend on this. The docids in
the index are absolute file paths, so you need to put the "data" dir
from the Zip file at the root of the same drive you're running the GUI
from. This directory contains the original docs, which the GUI goes back
to. Weak, I know, but it's just a demo.
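
To illustrate why the location of the "data" dir matters, here is a
minimal sketch of the "go back to the original doc" step, assuming a
docid is simply the absolute path of an XML file on disk. It is plain
Python using only the standard library, not the package's actual Java or
Jython code, and the function name fetch_from_original and the tag
argument are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def fetch_from_original(docid, tag):
    """Return the text of every <tag> element in the file a docid names.

    Assumption: the docid stored in the index is the absolute path of the
    original XML document, so data that was not stored in the index can
    be read straight from that file.
    """
    path = Path(docid)
    if not path.is_file():
        # This is the failure you would hit if the "data" dir is not
        # where the paths recorded in the index expect it to be.
        raise FileNotFoundError("original document not found: %s" % path)
    root = ET.parse(str(path)).getroot()
    # itertext() also collects text nested inside child elements.
    return ["".join(el.itertext()).strip() for el in root.iter(tag)]
```

Because the lookup depends entirely on the path recorded at indexing
time, moving the original docs anywhere else makes every hit
unresolvable, which is why the "data" dir has to sit where the index
expects it.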

I haven't tested this stuff outside of Windows, but it should just work
elsewhere.

Let me know if there's some hideous problem with the package.

Cheers,

Eliot