OK, so at about 100 GB you aren't going to fit the collection in memory
unless you spend a lot on servers. We haven't found memory (or disk access)
to be the limiting factor anyway -- CPU is the issue. I'm not sure what you
want to spend, but a single server with SATA RAID, 4 GB of RAM and the latest
AMD processor will search your collection in ~10-20 seconds, depending on the
complexity of the search. If you need faster searches or need to support many
simultaneous searches, you are going to have to parallelize the configuration
across multiple servers using ParallelMultiSearcher.
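A back-of-the-envelope way to size such a setup, sketched in Java. It assumes the search stays CPU-bound and that splitting the index into N roughly equal shards, one per server, cuts the per-query time roughly N-fold; in practice some of that is lost to merging results and to network overhead. The class name `ShardEstimate`, the method `shardsNeeded`, and the 2-second target are hypothetical and are not part of Lucene.

```java
/** Hypothetical sizing helper; assumes CPU-bound search and near-linear scaling. */
final class ShardEstimate {

    /**
     * Smallest number of equal shards (one per server) such that each shard's
     * search is expected to finish within targetSeconds.
     */
    static int shardsNeeded(double singleServerSeconds, double targetSeconds) {
        if (singleServerSeconds <= 0 || targetSeconds <= 0) {
            throw new IllegalArgumentException("times must be positive");
        }
        // Assumption: time per shard is about singleServerSeconds / shardCount.
        return (int) Math.max(1, Math.ceil(singleServerSeconds / targetSeconds));
    }

    public static void main(String[] args) {
        // Worst case quoted above is ~20 s on one server; 2 s is an assumed target.
        System.out.println(shardsNeeded(20.0, 2.0));
    }
}
```

This only covers latency for a single query. Supporting many searches at once needs additional capacity on top of whatever this estimate gives.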
Keep in mind that Lucene isn't really set up to handle parallel searching
robustly. There is a lot of code you are going to have to write for an
enterprise-ready solution: checking the status of each server to make sure
it isn't down, storing indexes redundantly so that the search still
functions if one server is down, possibly cutting off laggards so that one
slow server doesn't hold up the whole search, and so on.
We have done some of this, and have more to do -- it is a very non-trivial
task.
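To give a feel for the kind of plumbing involved, here is a minimal sketch that uses only `java.util.concurrent`. It is not Lucene's API and not the code described above. Each shard of the index is assumed to live on several replica servers, and each replica is modelled as a `Callable` that returns document IDs as strings. The class `ShardFanOut`, its methods, and the timeout values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: fan a query out to all shards, failing over between replicas. */
final class ShardFanOut {

    /**
     * Queries every shard in parallel and merges the hits in shard order.
     * Each inner list holds the replicas of one shard, in order of preference.
     * A shard with no usable replica contributes nothing (partial results).
     */
    static List<String> searchAll(List<List<Callable<List<String>>>> shards, long timeoutMillis) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<Future<List<String>>> perShard = new ArrayList<>();
            for (List<Callable<List<String>>> replicas : shards) {
                perShard.add(pool.submit(() -> firstHealthy(replicas, pool, timeoutMillis)));
            }
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : perShard) {
                try {
                    merged.addAll(f.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // caller interrupted: return what we have
                    break;
                } catch (ExecutionException e) {
                    // unexpected failure inside a shard task: skip that shard
                }
            }
            return merged;
        } finally {
            pool.shutdownNow(); // interrupt any laggards that are still running
        }
    }

    /** Tries the replicas one after another; each gets its own timeoutMillis budget. */
    private static List<String> firstHealthy(List<Callable<List<String>>> replicas,
                                             ExecutorService pool, long timeoutMillis) {
        for (Callable<List<String>> replica : replicas) {
            Future<List<String>> attempt = pool.submit(replica);
            try {
                return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                attempt.cancel(true);
                break;
            } catch (ExecutionException | TimeoutException e) {
                attempt.cancel(true); // down or too slow: fall through to the next replica
            }
        }
        return new ArrayList<>(); // no replica answered for this shard
    }

    public static void main(String[] args) {
        Callable<List<String>> down = () -> { throw new IllegalStateException("server down"); };
        Callable<List<String>> healthy = () -> Arrays.asList("doc1", "doc2");
        List<String> hits = searchAll(Arrays.asList(Arrays.asList(down, healthy)), 500);
        System.out.println(hits);
    }
}
```

A real deployment would also need health checks that run outside the query path, relevance-ordered merging of hits rather than simple concatenation, and some way to tell the user that results are partial. None of that is attempted here.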
Sincerely,
James Ryley, Ph.D.
> -----Original Message-----
> From: caribou_surf [mailto:eric@mixad.com]
> Sent: Monday, August 28, 2006 10:42 AM
> To: general@lucene.apache.org
> Subject: RE: Kind of hardware config ?
>
>
> About 100 GB
>
>
>
> James-10 wrote:
> >
> > What's the total document size?
> >
> > Sincerely,
> > James Ryley, Ph.D.
> >
> >> -----Original Message-----
> >> From: caribou_surf [mailto:eric@mixad.com]
> >> Sent: Monday, August 28, 2006 5:01 AM
> >> To: general@lucene.apache.org
> >> Subject: Kind of hardware config ?
> >>
> >>
> >> We want to index about 2 million HTML documents with Lucene.
> >> Do you have an idea of the most suitable machine configuration
> >> (dual processor, 2 GB of memory, RAID disks...)?
> >> --
> >> View this message in context: http://www.nabble.com/Kind-of-hardware-config---tf2176085.html#a6016661
> >> Sent from the Lucene - General forum at Nabble.com.
> >
> >
> >