Mailing List Archive: Approach for indexing and querying a good volume of data

Hi all,

I am planning to use Lucene (not in a cluster) for indexing and querying a
good volume of data. The use case is 10-20 documents per second (roughly
15-20 fields each), with queries running in parallel. Below is the approach
I am planning to take. Can anyone tell me from past experience whether it
sounds plausible, and how to handle performance issues? Ideally I would get
maximum throughput by indexing in batches, but latency would then be an
issue, so the simple approach is to index documents as they come.

-- Index documents as they come (single thread / blocking queue(1))
-- create document
-- add document
-- commit (this would be costly, but I see no other option). I guess I could
hold off on the commit; would Lucene autocommit after a certain point in time
(can I define any criteria)?
-- DirectoryReader.openIfChanged((DirectoryReader) indexReader,
indexWriter) to open a new reader before each query to pick up new changes.

-- Also, shall I close the indexWriter after every commit or keep it open for
the lifecycle of the application (the application could keep on running for
months and months)? Shall I create an indexReader for every query and close
it, or just keep the indexReader open for the lifecycle of the application?
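
To make the indexing steps above concrete, here is a rough sketch of the
single-threaded consumer with a deferred commit. It is only an illustration
under stated assumptions: IndexSink is a hypothetical stand-in for the two
IndexWriter calls involved (addDocument and commit), and
BatchedCommitIndexer, step, flush, maxPendingDocs and maxAgeMillis are
made-up names, not Lucene API. The commit thresholds are enforced by the
application itself, not by any automatic commit.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Single consumer that indexes documents as they arrive and defers commits. */
class BatchedCommitIndexer<D> {

    /** Hypothetical stand-in for IndexWriter.addDocument(...) and IndexWriter.commit(). */
    interface IndexSink<T> {
        void add(T doc) throws Exception;
        void commit() throws Exception;
    }

    private final BlockingQueue<D> queue;
    private final IndexSink<D> sink;
    private final int maxPendingDocs;   // commit once this many docs are uncommitted
    private final long maxAgeNanos;     // ...or once the oldest uncommitted doc is this old
    private int pending;                // docs added since the last commit
    private long firstPendingNanos;     // arrival time of the oldest uncommitted doc

    BatchedCommitIndexer(BlockingQueue<D> queue, IndexSink<D> sink,
                         int maxPendingDocs, long maxAgeMillis) {
        if (maxPendingDocs < 1) {
            throw new IllegalArgumentException("maxPendingDocs must be >= 1");
        }
        this.queue = queue;
        this.sink = sink;
        this.maxPendingDocs = maxPendingDocs;
        this.maxAgeNanos = TimeUnit.MILLISECONDS.toNanos(maxAgeMillis);
    }

    /**
     * Waits up to pollMillis for one document, adds it, and commits if either
     * threshold has been reached. Returns true if a document was handled.
     */
    boolean step(long pollMillis) throws Exception {
        D doc = queue.poll(pollMillis, TimeUnit.MILLISECONDS);
        long now = System.nanoTime();
        if (doc != null) {
            sink.add(doc);
            if (pending == 0) {
                firstPendingNanos = now;
            }
            pending++;
        }
        boolean tooMany = pending >= maxPendingDocs;
        boolean tooOld = now - firstPendingNanos >= maxAgeNanos;
        if (pending > 0 && (tooMany || tooOld)) {
            sink.commit();
            pending = 0;
        }
        return doc != null;
    }

    /** Commits whatever is still pending, e.g. on shutdown. */
    void flush() throws Exception {
        if (pending > 0) {
            sink.commit();
            pending = 0;
        }
    }
}
```

The indexing thread would call step(...) in a loop. The age check only runs
when step is called, so the poll timeout bounds how late a time-based commit
can be. With maxAgeMillis set to 0 this degenerates to a commit after every
document; larger values trade commit cost against how long a document can
sit uncommitted.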

Regards.

--
View this message in context: http://lucene.472066.n3.nabble.com/Approach-for-indexing-and-queryin-good-volume-data-tp4295110.html
Sent from the Lucene - General mailing list archive at Nabble.com.