Hi,
Let say I want to retrieve all relevant listings for a query (just
suppose)...
I have 4 million documents... I could:
Split these into 4 x 1 million document indexes and then send a
query to 4 Lucene processes ? At the end I would have to sort the
results by relevance.
Question for Doug or any other Search Engine guru -- would this
reduce the time to find these results by 75% ?
I know it is probably a hard question to answer (i.e. all the
documents that match, might just be in one process...) but I'm more
getting at the average length of the inverted indexes that have to be
joined being reduced by 75%, hence the join should take only 25% of
the time...
Any thoughts on this idiocy ? Reason why I ask ? Well, lets say I
can't fit a 4 million document RamDir index into 1GB heap space, but
I could if I split it up :) ?
Cheers,
Winton
Winton Davies
Lead Engineer, Overture (NSDQ: OVER)
1820 Gateway Drive, Suite 360
San Mateo, CA 94404
work: (650) 403-2259
cell: (650) 867-1598
http://www.overture.com/
--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Let say I want to retrieve all relevant listings for a query (just
suppose)...
I have 4 million documents... I could:
Split these into 4 x 1 million document indexes and then send a
query to 4 Lucene processes ? At the end I would have to sort the
results by relevance.
Question for Doug or any other Search Engine guru -- would this
reduce the time to find these results by 75% ?
I know it is probably a hard question to answer (i.e. all the
documents that match, might just be in one process...) but I'm more
getting at the average length of the inverted indexes that have to be
joined being reduced by 75%, hence the join should take only 25% of
the time...
Any thoughts on this idiocy ? Reason why I ask ? Well, lets say I
can't fit a 4 million document RamDir index into 1GB heap space, but
I could if I split it up :) ?
Cheers,
Winton
Winton Davies
Lead Engineer, Overture (NSDQ: OVER)
1820 Gateway Drive, Suite 360
San Mateo, CA 94404
work: (650) 403-2259
cell: (650) 867-1598
http://www.overture.com/
--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>