Mailing List Archive

Problems with index updates
Hi

I have an index of about 1.5 million documents. To make it available
to my web applications, I've set up a Tomcat 4.0 server with a couple
of JSP pages that query the index and return the results. The search
load is low to moderate.

Searching works fine; the real problem is updating the index, which is
the same index the searches run against (the "live" index). Every
10-15 minutes a few hundred articles are added, but this process is
very unstable; in particular, the commit.lock file seems to cause
problems when opening or closing the index.

I've implemented the update program as a separate process that runs
continuously, sleeping for about 10 minutes before each new update.
The update cycle looks something like this:


IndexWriter writer;

while (true) {

    // - fetch content to be indexed (works fine)

    // - get index (problem 1)
    try {
        writer = new IndexWriter("/home/search/nbindex", new StandardAnalyzer(), false);
    } catch (Exception e) {
        System.out.println("Exception when opening index. Wait for commit.lock : " + e);
        // Couldn't initialize writer; commit.lock exists
    }

    // ... if writer initialization went OK, update the index

    // ... if a couple of hours have gone since last time, optimize

    // close index - seems to cause problems sometimes too
    writer.close();

    // Sleep for 10 minutes

}
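
In the cycle above, writer.close() is reached even when the open
failed, so the close then acts on a writer that was never opened (or on
the one already closed in the previous pass). What follows is a minimal
sketch of one way to structure a single cycle so that close() only runs
after a successful open. `Writer` and `WriterFactory` are hypothetical
stand-ins, not Lucene types; in the real program the factory would wrap
the `new IndexWriter(...)` call shown above.

```java
import java.util.List;

/** One update cycle in which close() only runs on a writer that was actually opened. */
final class UpdateCycle {

    /** Hypothetical stand-in for the IndexWriter operations the loop uses. */
    interface Writer {
        void add(String doc) throws Exception;
        void close() throws Exception;
    }

    /** Hypothetical factory; assumed to wrap the real "open the index" call. */
    interface WriterFactory {
        Writer open() throws Exception;
    }

    /**
     * Runs one cycle. Returns true if every document was added and the writer
     * closed cleanly; false if the open, an add, or the close failed.
     */
    static boolean runOnce(WriterFactory factory, List<String> docs) {
        Writer writer;
        try {
            writer = factory.open();
        } catch (Exception e) {
            // Index is busy or locked: skip this cycle, leave the lock file alone.
            System.out.println("Could not open index, skipping this cycle: " + e);
            return false;
        }

        boolean ok = true;
        try {
            for (String doc : docs) {
                writer.add(doc);
            }
        } catch (Exception e) {
            ok = false;
            System.out.println("Update failed: " + e);
        } finally {
            // Reached only after a successful open, so the writer is always valid here.
            try {
                writer.close();
            } catch (Exception e) {
                ok = false;
                System.out.println("Close failed: " + e);
            }
        }
        return ok;
    }
}
```

The outer loop would call UpdateCycle.runOnce(...) and then sleep; a
false return simply means "try again next cycle".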

I've also tried explicitly deleting commit.lock in the program loop
(just to test it). That wasn't a good idea: forcing several index
updates without letting all of them complete damages the index.

Does anyone have any suggestions on what I'm doing wrong?



Thanks,

Kent Vilhelmsen






--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: Problems with index updates
Kent,

I can't tell what's wrong from this code alone, without exception
stack traces with line numbers.
The code looks fine to me.
I think I added some methods to IndexWriter that allow you to check if
the index is locked, maybe that can help you write more robust code...
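
As a rough illustration of checking for the lock before opening, here
is a sketch that waits for the lock file to disappear, with a bounded
timeout, instead of deleting it. It deliberately uses only java.io.File
rather than any Lucene method, since the exact lock-checking methods
are not named here; `LockWait`, `waitUntilAbsent`, and the lock file
location passed in are all assumptions made for the example.

```java
import java.io.File;

/** Polls for a lock file to go away instead of deleting it. */
final class LockWait {

    /**
     * Waits until lockFile no longer exists.
     *
     * @param lockFile      assumed location of the lock file (e.g. commit.lock)
     * @param timeoutMillis maximum total time to wait
     * @param pollMillis    pause between checks
     * @return true if the file is absent, false on timeout or interruption
     */
    static boolean waitUntilAbsent(File lockFile, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (lockFile.exists()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // still locked: give up for this cycle
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

The check and the open are not atomic, so the open can still fail if
another process takes the lock in between; the exception handling
around the open is still needed.
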
Nice service there. I created a service much like Newsbooster a few
years back called InfoJump. Memories...

Otis


