Mailing List Archive

deleting question
I did a batch deletion on an index. Then, after searching the archives for
something else, I came across this

> I understand there are three modes for using IndexReader and
> IndexWriter:
>
> A- IndexReader for reading only, not deleting
> B- IndexReader for deleting (and reading)
> C- IndexWriter (for adding and optimizing)

> What matters is that only one of B or C can be done at a time.
> That is to say, only a single process/thread may modify an index at
> once; modification must be single-threaded.

Looking back at the code I ran, I had an IndexWriter open, and then I
opened an IndexReader and did the deletions. Every 300,000 deletions, I
called the writer's optimize method. When I was done with the deletions, I
closed the reader, called optimize on the writer one last time, then closed
the writer.

My question is: does anyone know offhand whether having both of those open
at once could have done anything bad (like corrupting my index)?
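
For what it's worth, here is the sequence I gather that post recommends,
with only one index modifier open at a time. This is a minimal sketch
against the Lucene 1.x API of that era; the index path and the "id" field
are made up for illustration:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;

    public class SafeDelete {
        public static void main(String[] args) throws Exception {
            String indexPath = "/path/to/index";  // hypothetical location

            // Mode B: open a reader *by itself* and do the deletions.
            IndexReader reader = IndexReader.open(indexPath);
            reader.delete(new Term("id", "doc-42"));  // delete matching docs
            reader.close();  // releases the lock before any writer opens

            // Mode C: only now open a writer to optimize. The third
            // argument is false because the index already exists.
            IndexWriter writer =
                new IndexWriter(indexPath, new StandardAnalyzer(), false);
            writer.optimize();
            writer.close();
        }
    }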

Thanks,

Dan


-----Original Message-----
From: Armbrust, Daniel C. [mailto:Armbrust.Daniel@mayo.edu]
Sent: Monday, May 20, 2002 5:06 PM
To: 'Lucene Users List'
Subject: RE: Lucene's scalability


In my experience, the time it takes depends much more on the complexity of
the query than on the number of results returned. When I run a query with
50-60 terms, I usually get down to a couple thousand results or fewer.
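
For concreteness, the queries I am describing look something like the
sketch below. The field name and terms are invented, and it assumes the
Lucene 1.x search API (Hits, and BooleanQuery.add with required/prohibited
flags):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;

    public class ManyTermQuery {
        public static void main(String[] args) throws Exception {
            IndexSearcher searcher = new IndexSearcher("/path/to/index");

            // One boolean query with many optional term clauses; a real
            // query of the kind I describe would have 50-60 of these.
            String[] words = { "aorta", "ventricle", "stenosis" };
            BooleanQuery query = new BooleanQuery();
            for (int i = 0; i < words.length; i++) {
                // add(clause, required, prohibited): an optional OR clause
                query.add(new TermQuery(new Term("contents", words[i])),
                          false, false);
            }

            long start = System.currentTimeMillis();
            Hits hits = searcher.search(query);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(hits.length() + " hits in " + elapsed + " ms");
            searcher.close();
        }
    }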

Dan


-----Original Message-----
From: CNew [mailto:cnew@fuse.net]
Sent: Monday, May 20, 2002 8:46 AM
To: Lucene Users List
Subject: Re: Lucene's scalability


You didn't mention the hit count on the query with 50-60 terms.
Just wondering if the time was linear in the number of hits.

----- Original Message -----
From: Armbrust, Daniel C. <Armbrust.Daniel@mayo.edu>
To: 'Lucene Users List' <lucene-user@jakarta.apache.org>
Sent: Friday, May 17, 2002 6:57 AM
Subject: RE: Lucene's scalability


> Currently, we are using an Ultra-80 SPARC Solaris machine with 4
> processors and 4 GB of RAM.
>
> However, we are only making use of one of those processors with the index.
> Our biggest speed restriction is the fact that our entire index resides on
> a single disk drive. We have a RAID array coming soon.
>
> The performance has been very impressive, but as you can imagine, the
> speed depends highly on the complexity of the query. If you run a query
> with half a dozen terms and fields, which returns ~30,000 results, it
> usually takes on the order of a second or two. If you run a query with
> 50-60 terms, it may take 5-6 seconds.
>
> I don't have any better performance stats than this currently.
>
> Dan
>
>
> -----Original Message-----
> From: Harpreet S Walia [mailto:harpreet@sansuisoftware.com]
> Sent: Friday, May 17, 2002 7:23 AM
> To: Lucene Users List
> Subject: Re: Lucene's scalability
>
>
> Hi,
>
> I am also trying to do a similar thing, and I am very eager to know what
> kind of hardware you are using to maintain such a big index.
> In my case it is very important that the search happens very fast, so does
> such a big index of 10 GB pose any problems in this direction?
>
> TIA
>
> Regards
> Harpreet
>
>
>
> ----- Original Message -----
> From: "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu>
> To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
> Sent: Tuesday, April 30, 2002 12:07 AM
> Subject: RE: Lucene's scalability
>
>
> > I currently have an index of ~12 million documents, each about that
> > size (but in XML form).
> >
> > When they are transformed for Lucene to index, there are upwards of 50
> > searchable fields.
> >
> > The index is about 10 GB right now.
> >
> > I have not yet had any problems with "pushing the limits" of Lucene.
> >
> > In the next few weeks, I will be pushing my number of indexed documents
> > up into the 15-20 million range. I can let you know if any problems
> > arise.
> >
> > Dan
> >
> >
> >
> > -----Original Message-----
> > From: Joel Bernstein [mailto:j.bernstein@ei.org]
> > Sent: Monday, April 29, 2002 1:32 PM
> > To: lucene-user@jakarta.apache.org
> > Subject: Lucene's scalability
> >
> >
> > Is there a known limit to the number of documents that Lucene can handle
> > efficiently? I'm looking to index around 15 million 2K docs, which
> > contain 7-10 searchable fields. Should I be attempting this with Lucene?
> >
> > Thanks,
> >
> > Joel

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>