Mailing List Archive

FW: Lucene 1.2 and directory write permissions?
Here's one vote for putting locks in a separate directory. Anyone dislike
that?

Doug

-----Original Message-----
From: Snyder, David [mailto:dsnyder@netgenics.com]
Sent: Friday, October 05, 2001 11:23 AM
To: Doug Cutting
Subject: RE: Lucene 1.2 and directory write permissions?


The lock file synchronization is very handy for us as we do updates in a
separate process from searches... I was very pleased to see this in there!

I think splitting out the locks into a separate directory would solve our
problem... we have a reference set of data that we use for testing and our
sysadmin wants to make sure it doesn't accidentally get overwritten. It's
still possible (using links and other permission magic) but becomes more of
a maintenance headache. I definitely vote for the locks subdirectory idea.
It can be created when the indexes are initially loaded, we can go change
the permissions or maybe make it a link to a tmp directory or something, and
then our regular index files can be safe. Do you think this is something
very difficult to do? (I have yet to build Lucene myself, but I would love
to contribute... we are actually working on some XML based loaders that may
be of general interest)

Lucene has been great for us, by the way... we are indexing genetic data
(not the sequences themselves, but all the annotations and description stuff
that scientists tack on) and Lucene has been excellent... our indexes (we
use many with the multisearcher) are about 13 gigs now and Lucene has hardly
broken a sweat.

Thanks for your help,
Dave

David Snyder
Señor Software Engineer

NetGenics, Inc.
1717 E. 9th St., #1700
Cleveland, OH 44114
(216) 861-4007


-----Original Message-----
From: Doug Cutting [mailto:DCutting@grandcentral.com]
Sent: Friday, October 05, 2001 12:24 PM
To: 'Snyder, David'
Subject: RE: Lucene 1.2 and directory write permissions?


> From: Snyder, David [mailto:dsnyder@netgenics.com]
>
> I've been porting our application to use the 1.2 release
> candidate 1 build
> and now have a problem opening searchers on our existing
> indexes. I get a
> Permission Denied exception... our permissions are set up to
> allow reading
> of the directory and contained files during a search, but not writing.

Hmm. That is a problem. The reader now creates a lock file while it is
opening the index to keep a writer process from deleting files while they're
being opened. When opening an index the reader must first read the list of
files to open, then open them. If between reading the list and opening a
file that file were to disappear, then the open would fail. This was the
longstanding race condition that is fixed by the lock files. A writing
process will now wait for the reader to open all of the files before
updating things.
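
[Editor's sketch] The lock-file handshake described above can be illustrated roughly as follows. This is not Lucene's actual code: the class name LockFileSketch and its methods are hypothetical, and the only library behavior relied on is that java.io.File.createNewFile() creates the file atomically and returns false if it already exists. A reader would hold the lock while it reads the file list and opens the files; a writer would take the same lock before deleting anything.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical illustration of a lock file shared by reader and writer processes. */
final class LockFileSketch {

    /**
     * Polls until the lock file can be created or the timeout expires.
     * Returns true if this caller now holds the lock.
     */
    static boolean obtain(File lockFile, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            // createNewFile() is atomic: exactly one process wins the creation.
            if (lockFile.createNewFile()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    /** Releases the lock by deleting the file. */
    static void release(File lockFile) {
        lockFile.delete();
    }
}
```

Note that createNewFile() needs write permission on the directory that holds the lock file, which is exactly why a read-only index directory produces the Permission Denied error reported above.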

Perhaps we should instead write lock files in a subdirectory of the index
named "locks". You could make that directory read/write, but make the
parent read-only. Alternately, we could have a flag that turns off the use
of lock files, for those who know that there is no other process that is
potentially simultaneously updating the index. Which approach would folks
prefer?
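
[Editor's sketch] The two options in this paragraph might look something like the following. Everything here is an assumption made for illustration: the class LockPolicySketch, its constructor flag, and the method lockFileFor are hypothetical names, not part of any Lucene release. Only the directory name "locks" comes from the proposal itself.

```java
import java.io.File;

/** Hypothetical illustration of the "locks" subdirectory and the opt-out flag. */
final class LockPolicySketch {
    private final File indexDir;
    private final boolean locksDisabled;

    LockPolicySketch(File indexDir, boolean locksDisabled) {
        this.indexDir = indexDir;
        this.locksDisabled = locksDisabled;
    }

    /**
     * Returns where the named lock file would live, or null when the caller
     * has declared that no other process updates the index.
     */
    File lockFileFor(String lockName) {
        if (locksDisabled) {
            return null;
        }
        // Lock files go under <index>/locks so the index directory itself
        // can stay read-only while only the subdirectory is writable.
        return new File(new File(indexDir, "locks"), lockName);
    }
}
```

With this layout, only the "locks" subdirectory needs write permission; the index files in the parent directory can remain read-only.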

Doug
Re: FW: Lucene 1.2 and directory write permissions?
Just a thought - as long as we are separating the locks, we should
probably make the location configurable and not require it to be a
subdirectory of the indexes. This will help in case the indexes come
burned on a CD-ROM or if the operating system does not support easy
linking / permission magic.
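
[Editor's sketch] One way to make the lock location configurable is to consult a setting first and fall back to the "locks" subdirectory. The class LockDirResolver and the property name "example.index.lockDir" are hypothetical, chosen only for this illustration; System.getProperty is the standard Java call.

```java
import java.io.File;

/** Hypothetical illustration of a configurable lock directory. */
final class LockDirResolver {
    /** Hypothetical property name; not a real Lucene setting. */
    static final String LOCK_DIR_PROPERTY = "example.index.lockDir";

    /**
     * Returns the configured lock directory if one is set, otherwise the
     * "locks" subdirectory of the index.
     */
    static File resolve(File indexDir) {
        String configured = System.getProperty(LOCK_DIR_PROPERTY);
        if (configured != null && !configured.isEmpty()) {
            // An external location works even when the index is on
            // read-only media such as a CD-ROM.
            return new File(configured);
        }
        return new File(indexDir, "locks");
    }
}
```

If several indexes share one external lock directory, the lock file names would also need to identify the index they belong to; that detail is left out of this sketch.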

Re: FW: Lucene 1.2 and directory write permissions?
Also, I think the flag to turn off process-safety is a good idea. This
should help to get some extra query performance in applications where
only one process accesses the index.
Re: FW: Lucene 1.2 and directory write permissions?
>Just a thought - as long as we are separating the locks, we should
>probably make the location configurable and not require it to be a
>subdirectory of the indexes.

That's good if you're indexing to a NFS server with broken locking, too.


nelson@monkey.org
. . . . . . . . http://www.media.mit.edu/~nelson/