Mailing List Archive

Lucene vs Solr Indexing Speed on Sample data Issue!!!
Hello Everyone,

I posted a question on stackoverflow.com after performing a few POCs.

My hardware is a laptop with a single Intel Core i3 processor (4 CPUs
according to "dxdiag") and 8 GB of RAM.

My question link:
http://stackoverflow.com/questions/30823314/lucene-vs-solr-indexning-speed-for-sampe-data

No one has been able to solve it so far.
I hope the question I posted is understandable.

Could anyone please help me with the indexing speed of Solr (way
slower) vs. Lucene (way faster)?

I am trying to build a module for real-time indexing and querying, and the
traffic is high. The POC passes with Lucene, which handles the high indexing
traffic, but Solr is not able to do so.

Again, my machine spec:
HP, Intel Core i3, 8 GB RAM, TB HDD.

Please let me know whether there is a problem with Solr or whether I am doing
something wrong.

Thanks
Argho
Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!
Please post the original question here, so that everything people need
to review your question is included within this thread!

Oh, and for a high-throughput system, 8 GB of RAM doesn't sound like much. A
Lucene index, whether inside Solr or not, benefits from a lot of RAM.

Thanks!

Upayavira

On Mon, Jun 15, 2015, at 05:19 AM, Argho Chatterjee wrote:
> [...]
Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!
And what does high throughput actually mean in terms of number of documents
per second and bytes (or terms) per document?



On Mon, Jun 15, 2015 at 11:56 AM, Upayavira <uv@odoko.co.uk> wrote:

> [...]
RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!
From my experience, here is a "high throughput" example:

Using a single-threaded SolrJ client, I can index (for example) 1000 documents per second, and that is the maximum "speed".
Using 12 threads, I can index 12000 documents per second, simply because we have an 8-core Solr server and 75% of the processing is CPU-bound.


You can do this easily with Solr + SolrJ; with plain Lucene you will need much more development effort, but the result is the same.
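
A minimal sketch of the multi-threaded client idea above, written in Python rather than SolrJ to keep it short. Everything here is an assumption for illustration: `send_batch` is a hypothetical stand-in that only sleeps to simulate one indexing request (a real client would call the Solr update API there), and the batch size, simulated latency, and thread counts are made up rather than measured.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send_batch(batch, latency_s=0.001):
    """Hypothetical stand-in for one indexing request.

    It only sleeps to simulate the time a client thread spends waiting
    for the server; it does not talk to Solr.
    """
    time.sleep(latency_s)
    return len(batch)


def index_all(docs, threads, batch_size=100, send=send_batch):
    """Send docs in batches from `threads` worker threads.

    Returns (number of documents sent, documents per second).
    """
    # Split the documents into fixed-size batches, one request per batch.
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each worker thread sends whole batches; results are summed.
        indexed = sum(pool.map(send, batches))
    elapsed = time.perf_counter() - start
    return indexed, indexed / elapsed


if __name__ == "__main__":
    docs = [{"id": i} for i in range(5000)]
    for n in (1, 12):
        count, rate = index_all(docs, threads=n)
        print(f"{n} thread(s): {count} docs, {rate:.0f} docs/s")
```

Running the same data with 1 thread and with 12 threads shows how the rate changes with client-side parallelism in this simulation. Against a real server, the speed-up depends on how many cores it has and how much of the work is CPU-bound.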



Thanks,


http://www.tokenizer.ca

-----Original Message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com]
Sent: June-15-15 3:17 PM
To: general@lucene.apache.org
Subject: Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!

[...]
RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!
In general, "out-of-the-box", pre-configured Solr is slower than Lucene with no configuration at all.

From another viewpoint, single-threaded HTTP access is I/O-bound: there is a network round trip of 50 ms before Solr spends 5 nanoseconds indexing the document. Using 128 parallel threads on the client side and fine-tuning Tomcat will help.
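
The round-trip argument can be put into numbers with a short sketch (Python, illustrative only). It assumes each client thread sends one synchronous request and waits a full round trip before sending the next, and that the server keeps up. The function names and the `docs_per_request` parameter are assumptions added for illustration, not something from this thread. With a 50 ms round trip, one thread tops out at 20 requests per second, and 128 threads raise that ceiling to 2560.

```python
def max_requests_per_second(round_trip_ms, threads=1):
    """Upper bound on synchronous requests per second.

    Each thread completes at most one request per round trip, so the
    ceiling scales linearly with the number of client threads.
    """
    return threads * 1000 / round_trip_ms


def max_docs_per_second(round_trip_ms, threads=1, docs_per_request=1):
    """Same bound expressed in documents, allowing several docs per request."""
    return max_requests_per_second(round_trip_ms, threads) * docs_per_request


if __name__ == "__main__":
    print(max_requests_per_second(50))        # one thread, 50 ms round trip
    print(max_requests_per_second(50, 128))   # 128 parallel threads
    print(max_docs_per_second(50, 1, 100))    # one thread, 100 docs per request
```

Sending several documents per request raises the ceiling in the same way as adding threads, which is why batching is usually worth trying alongside client-side parallelism.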




-----Original Message-----
From: Fuad Efendi [mailto:fuad@efendi.ca]
Sent: June-15-15 4:18 PM
To: 'general@lucene.apache.org'
Subject: RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!

[...]