Mailing List Archive

Indexing performance benchmarks
I think it would be really useful if users could post performance
benchmarks for their use of Lucene in their apps. I know it's been done
informally on an ad hoc basis by various people in the past, but I'd like to
propose a standardized format:

Number of source documents:
Total filesize of source documents:
Average filesize of source documents (in KB/MB):
Source documents storage location (filesystem, DB, HTTP, etc.):
File type of source documents:
Parser(s) used, if any:
Time taken (in ms/s as an average of at least 3 indexing runs):
Notes (any special tuning/strategies):

This would help users know what indexing performance to expect, and should
raise warning flags when their indexing times differ markedly from the
benchmarks. Anyone want to start? :)

Regards,
Kelvin


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: Indexing performance benchmarks
I like this idea too.

It would also be good to know about the system used:

Java Version
OS Version
CPU (Type, Speed and Quantity)
RAM
Drive configuration (IDE, SCSI, RAID-1, RAID-5)



On 5/2/02 11:47 PM, "Kelvin Tan" <kelvin@relevanz.com> wrote:

>
> Number of source documents:
> Total filesize of source documents:
> Average filesize of source documents (in KB/MB):
> Source documents storage location (filesystem, DB, http,etc):
> File type of source documents:
> Parser(s) used, if any:
> Time taken (in ms/s as an average of at least 3 indexing runs):
> Notes (any special tuning/strategies):


Re: Indexing performance benchmarks
Excellent. Otis suggested adding the number and type of fields as well. I'd
really like to consolidate these figures and put them somewhere on the
website, if that's OK with the people who contribute. Thanks, Peter, for
adding the hardware stats and the others.

Here's an updated and sorted list:

<benchmark>
Hardware environment
Dedicated machine for indexing (yes/no):
CPU (Type, Speed and Quantity):
RAM:
Drive configuration (IDE, SCSI, RAID-1, RAID-5):

Software environment
Java Version:
OS Version:
Location of index directory (local/network):

Lucene indexing variables
Number of source documents:
Total filesize of source documents:
Average filesize of source documents (in KB/MB):
Source documents storage location (filesystem, DB, HTTP, etc.):
File type of source documents:
Parser(s) used, if any:
Analyzer(s) used:
Number of fields per document:
Type of fields:
Index persistence (FSDirectory, SqlDirectory, etc):

Time taken (in ms/s as an average of at least 3 indexing runs):
Time taken / 1000 docs indexed:
Memory consumption:

Notes (any special tuning/strategies):
</benchmark>
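
As a rough illustration of how the two timing figures in the template might
be produced, here is a minimal sketch in modern Java. It is not part of
Lucene: the class name IndexingBenchmark and both method names are
hypothetical, and the workload in main is only a placeholder standing in for
whatever code builds your index from scratch.

```java
/** Hypothetical timing harness for the benchmark template; not a Lucene API. */
class IndexingBenchmark {

    /** Runs the task the given number of times and returns the mean wall-clock time in ms. */
    static double averageMillis(Runnable indexingTask, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            indexingTask.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    /** Normalizes a total time to the "Time taken / 1000 docs indexed" figure. */
    static double millisPerThousandDocs(double totalMillis, long docCount) {
        if (docCount < 1) {
            throw new IllegalArgumentException("docCount must be >= 1");
        }
        return totalMillis * 1000.0 / docCount;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with a full indexing run.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        long docCount = 5000; // assumed number of source documents
        double avg = averageMillis(placeholder, 3); // template asks for at least 3 runs
        System.out.printf("Time taken (avg of 3 runs): %.1f ms%n", avg);
        System.out.printf("Time taken / 1000 docs: %.1f ms%n",
                millisPerThousandDocs(avg, docCount));
    }
}
```

Each run should start from an empty index directory so the runs are
comparable. The first run may still be slower than later ones because of JVM
warm-up and a cold filesystem cache, so it may be worth reporting the
individual run times in the Notes field as well.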

If you'd like to contribute these stats but wish to remain anonymous, that's
cool too. You can mail me offline or something, and your boss will never
know...:)

----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, May 03, 2002 9:21 PM
Subject: Re: Indexing performance benchmarks


> I like this idea too.
>
> It would also be good to know about system used
>
> Java Version
> OS Version
> CPU (Type, Speed and Quantity)
> RAM
> Drive configuration (IDE, SCSI, RAID-1, RAID-5)
>
>
>
> On 5/2/02 11:47 PM, "Kelvin Tan" <kelvin@relevanz.com> wrote:
>
> >
> > Number of source documents:
> > Total filesize of source documents:
> > Average filesize of source documents (in KB/MB):
> > Source documents storage location (filesystem, DB, http,etc):
> > File type of source documents:
> > Parser(s) used, if any:
> > Time taken (in ms/s as an average of at least 3 indexing runs):
> > Notes (any special tuning/strategies):
>
>

