Mailing List Archive

existing or not existing
Hi there,

I'm testing Lucene after reading a good article on it on JavaWorld.

Lucene seems quite simple and very powerful, but there's something I can't get.
The first time an application uses an index, the index doesn't exist yet, so the
boolean argument of the IndexWriter constructor must be true (creating a new
empty index). The next time the same app is started, I want to use the existing
index, so the boolean argument must be false. Here is my question: how do I know
whether the index exists or not? Is there a way to create an IndexWriter on a
given index, creating it only if needed?

It seems like a stupid question, I must have missed something...

Thanks


JCG



--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: existing or not existing
I think the 'create' flag really indicates whether it's ok
to *overwrite* the *possibly*existing* index.
Despite the tricky nuance it works great.

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#IndexWriter(org.apache.lucene.store.Directory,%20org.apache.lucene.analysis.Analyzer,%20boolean)


--
http://www.tropo.com/



Re: existing or not existing
Thank you for your quick answer.

I agree that create=true indicates that it's ok to overwrite. But when
create=false and the index does not exist, I get a FileNotFoundException.

I expected something like the java.io.FileOutputStream 'append' flag:
false = overwrite
true = if the file exists, use it; if not, create an empty one

The choice of the constructor argument does not depend on "Does that file exist?"
but rather on "Do I overwrite a possibly existing file?".


JCG
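
For comparison, a minimal sketch of the FileOutputStream 'append' behaviour
described above. The class and method names (AppendFlagDemo, writeByte) are
made up for illustration; only java.io.FileOutputStream itself comes from the
message.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

final class AppendFlagDemo {
    /** Writes one byte using the given append flag and returns the resulting file length. */
    static long writeByte(File file, boolean append) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file, append)) {
            out.write('x');
        }
        return file.length();
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("append-demo", ".bin").toFile();
        file.delete();                                // start from a missing file
        System.out.println(writeByte(file, true));    // prints 1: append=true creates the missing file
        System.out.println(writeByte(file, true));    // prints 2: append=true keeps the existing content
        System.out.println(writeByte(file, false));   // prints 1: append=false overwrites
        file.delete();
    }
}
```

Neither value of the flag fails on a missing file, which is the difference
from the IndexWriter 'create' flag being discussed here.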



David Spencer <dave@tropo.com> wrote on 04.12.2001 16:19:

> I think the 'create' flag really indicates whether it's ok
> to *overwrite* the *possibly*existing* index.
> Despite the tricky nuance it works great.

Re: existing or not existing
You could try looking for a segments file in the index directory.
If it exists, the index exists, else it does not.

Is there a better way?

Otis
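
A minimal sketch of the check suggested above, using only java.io.File. It
assumes the file is literally named "segments" and sits directly in the index
directory; that name is an assumption taken from the wording of this thread,
and other Lucene versions may name the file differently. SegmentsCheck and
indexLooksPresent are hypothetical names, not part of Lucene.

```java
import java.io.File;

final class SegmentsCheck {
    // Assumption: the index keeps a file with exactly this name at its top level.
    static final String SEGMENTS_FILE = "segments";

    /** Returns true if the directory contains a regular file named "segments". */
    static boolean indexLooksPresent(File indexDir) {
        return new File(indexDir, SEGMENTS_FILE).isFile();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : "index");
        // The result could be negated and passed as the IndexWriter 'create' argument.
        System.out.println("index present: " + indexLooksPresent(dir));
    }
}
```

The check and the later open are two separate steps, so another process could
create or remove the index in between.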

Re: existing or not existing
Personally, I like to know when I'm creating an index
so tend to have different code paths, or even different
classes, for reading and writing indexes. But how
about creating the IndexWriter with flag set to false
and catching the exception it will no doubt throw
if the index doesn't exist, and reissuing the
call with the flag set to true?



--
Ian.
ian.lea@blackwell.co.uk
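
A sketch of the retry idea above, written without Lucene on the classpath.
The Opener interface is a hypothetical stand-in for the constructor call with
its 'create' flag. The sketch retries only on FileNotFoundException, the
exception reported earlier in this thread for a missing index; whether that
exception is thrown in every such case is an assumption.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

final class RetryOpen {
    /** Hypothetical stand-in for a call such as "new IndexWriter(dir, analyzer, create)". */
    interface Opener<W> {
        W open(File dir, boolean create) throws IOException;
    }

    /** Tries create=false first and falls back to create=true only if nothing was found. */
    static <W> W openOrCreate(File dir, Opener<W> opener) throws IOException {
        try {
            return opener.open(dir, false);   // reuse an existing index if there is one
        } catch (FileNotFoundException missing) {
            return opener.open(dir, true);    // nothing there yet: create a new one
        }
        // Any other IOException propagates, so this method does not overwrite on it.
    }

    public static void main(String[] args) throws IOException {
        // Fake opener: create=false fails on a missing index, as described in the thread.
        Opener<String> fake = (dir, create) -> {
            if (!create) throw new FileNotFoundException(dir + " has no index");
            return "created";
        };
        System.out.println(openOrCreate(new File("no-such-index"), fake)); // prints "created"
    }
}
```

A FileNotFoundException can still have causes other than a missing index, so
this narrows the risk of overwriting a good index but does not remove it.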


RE: existing or not existing
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Tuesday, December 04, 2001 5:46 PM
> To: Lucene Users List
> Subject: Re: existing or not existing
>
>
> You could try looking for a segments file in the index directory.
> If it exists, the index exists, else it does not.
>
> Is there a better way?
>
Jive forum uses this method.


> Otis
peter
Re: existing or not existing
Ian Lea wrote:
>
> Personally, I like to know when I'm creating an index
> so tend to have different code paths, or even different
> classes, for reading and writing indexes. But how
> about creating the IndexWriter with flag set to false
> and catching the exception it will no doubt throw
> if the index doesn't exist, and reissuing the
> call with the flag set to true?

This is what I do, but frankly it's a little scary.

Basically, if your open fails for some other reason, it
blows away your index.

In my application that's acceptable since my index is
pretty transient, but it's a bit iffy in a lot of situations.

I think it would make sense to follow the "standard"
behaviour: if that flag is true, make the index if needed
and use it if it exists.

The "false" flag could be used to trap exceptions if
the index doesn't exist, and allow the user to perform
other initializations.

Simply clearing the index completely should be a separate
mechanism that the user would call if they really wanted
to overwrite an index with a new one.

--
Trevor Boicey, P. Eng.
Ottawa, Canada, tboicey@brit.ca
ICQ #17432933 http://www.brit.ca/~tboicey/
"There will be no aggressive condiment passing in this house!" - Marge

Re: existing or not existing
> This is what I do, but frankly it's a little scary.
>
> Basically, if your open fails for some other reason, it
> blows away your index.
>
> I think it would make sense to follow the "standard"
> of, if that flag is true, to make the index if needed and
> use it if it exists.
>
> To simply clear the index completely should be another
> mechanism that the user would call if they really wanted
> to overwrite an index with a new one.


I agree with you.

To make sure that I don't overwrite an existing index just because the open
failed, I won't rely on the exception but will instead check for the presence
of the segments file, as suggested in another reply.

JCG



Re: existing or not existing
Otis,

>You could try looking for a segments file in the index directory.
>If it exists, the index exists, else it does not.
>
>Is there a better way?

Simply create an IndexReader and catch the IOException to handle the
non-existent case.

Ype.


RE: existing or not existing
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>
> You could try looking for a segments file in the index directory.
> If it exists, the index exists, else it does not.
>
> Is there a better way?

I think that's currently the best way. But it's not great, because it
requires applications to know something about the internal structure of the
index.

Going forward, I'm hesitant to change the semantics of the 'create' flag.
I'm also hesitant to add another flag or constructor method.

Perhaps the addition of the following IndexReader methods would suffice:

/** Returns true iff an index exists in the named directory. */
public static boolean indexExists(String directory);
public static boolean indexExists(File directory);
public static boolean indexExists(Directory directory);

These are analogous to the 'lastModified' methods. Internally these would
just check for the existence of the segments file.

Does that sound like a good plan?

Another place that currently requires application knowledge of index
structure is failure recovery. Currently if an indexing application crashes
it may leave .lock files in the directory which must be removed before the
index can be altered again. Perhaps this can be resolved similarly by
adding methods like:

/** Returns true iff the index in the named directory is currently locked. */
public static boolean isLocked(Directory directory);

/** Forcibly unlocks the index in the named directory.
* Caution: this should only be used by failure recovery code,
* when it is known that no other process or thread is in fact
* currently accessing this index.
*/
public static void unlock(Directory directory);

We could also have String and File versions for convenience.

Would folks use something like this? If so, more fodder for the TODO list!

Doug
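
To make the failure-recovery part of the proposal concrete, here is a rough
sketch of what such helpers might do if implemented with plain java.io.File.
It is not the Lucene implementation. It assumes, as described above, that
leftover lock files live in the index directory and have names ending in
".lock"; the class name LockFiles and the file name "write.lock" used in the
demo are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class LockFiles {
    /** Lists files in the directory whose names end in ".lock" (empty if none or unreadable). */
    static File[] lockFiles(File indexDir) {
        File[] found = indexDir.listFiles((dir, name) -> name.endsWith(".lock"));
        return found != null ? found : new File[0];
    }

    /** Returns true if at least one lock file is present. */
    static boolean isLocked(File indexDir) {
        return lockFiles(indexDir).length > 0;
    }

    /** Deletes leftover lock files. Only safe when nothing else is using the index. */
    static void unlock(File indexDir) throws IOException {
        for (File lock : lockFiles(indexDir)) {
            Files.delete(lock.toPath());   // throws if a lock file cannot be removed
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("index-demo").toFile();
        new File(dir, "write.lock").createNewFile();   // simulate a crashed indexer
        System.out.println(isLocked(dir));             // prints "true"
        unlock(dir);
        System.out.println(isLocked(dir));             // prints "false"
        dir.delete();
    }
}
```

As the proposed Javadoc warns, the unlock step belongs in recovery code only,
since it cannot tell a stale lock from one held by a live process.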

RE: existing or not existing
Yes, I would use this, especially the IndexReader methods that you
suggested.

Otis

RE: existing or not existing
Doug Cutting <DCutting@grandcentral.com> wrote on 05.12.2001 18:21:

> Would folks use something like this? If so, more fodder for the TODO list!


I would use the new methods of IndexReader, and I have one more suggestion:

public boolean isOptimized()


JCG



