Mailing List Archive

Slower search after 8.5.x to >=8.6
Hi. I noticed that after upgrading from Lucene 8.5.x to Lucene >= 8.6,
search became slower (for example, TopScoreDocCollector became 20-30% slower,
and through Elasticsearch, 50% slower).

While testing, I found that the slowdown started with LUCENE-9257 (commit
e7a61ea). Is this a bug or a feature? Could a setting for isOffHeap be added,
so that the developer makes this choice explicitly?

I have attached a file with a simple demo showing that the search is slow.
Run it on commits e7a61ea and 90aced5 and you will notice how the speed
drops to 30%.
Re: Slower search after 8.5.x to >=8.6
Hello,

Why are you forcing NIOFSDirectory instead of using Lucene's defaults via
FSDirectory#open? I wonder if this might contribute to the slowdown you are
seeing given that access to the terms index tends to be a bit random.

It's very unlikely we'll add back a toggle for this as there is no point in
holding the terms index in JVM heap when it could live in the OS cache
instead.

On Thu, Apr 8, 2021 at 7:57 AM <mihaylovnikitos@gmail.com> wrote:
> [...]

--
Adrien
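
The remark above about random access to the terms index can be illustrated with plain JDK classes, without Lucene. The sketch below is only an approximation under stated assumptions: a byte[] stands in for a terms index held on the JVM heap, a MappedByteBuffer stands in for MMapDirectory, and unbuffered positional FileChannel reads stand in for NIOFSDirectory (which in reality buffers its reads, so the sketch exaggerates its cost). The class name RandomReadSketch, the file size and the read count are made up for illustration, and a single pass without warmup is not a rigorous benchmark.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

final class RandomReadSketch {
    static final int FILE_SIZE = 1 << 20; // hypothetical 1 MiB stand-in for a terms index file
    static final int READS = 100_000;     // hypothetical number of random accesses

    // On-heap: the whole file has been copied into a byte[] up front.
    static long sumHeap(byte[] data, int[] offsets) {
        long sum = 0;
        for (int off : offsets) sum += data[off];
        return sum;
    }

    // Memory-mapped: reads hit the OS page cache with no system call per access.
    static long sumMapped(MappedByteBuffer map, int[] offsets) {
        long sum = 0;
        for (int off : offsets) sum += map.get(off);
        return sum;
    }

    // Positional channel reads: one read call per access (no buffering in this sketch).
    static long sumChannel(FileChannel ch, int[] offsets) throws IOException {
        ByteBuffer one = ByteBuffer.allocate(1);
        long sum = 0;
        for (int off : offsets) {
            one.clear();
            ch.read(one, off);
            sum += one.get(0);
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Random rnd = new Random(42);
        byte[] data = new byte[FILE_SIZE];
        rnd.nextBytes(data);
        Path file = Files.createTempFile("terms-index-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, data);

        int[] offsets = new int[READS];
        for (int i = 0; i < READS; i++) offsets[i] = rnd.nextInt(FILE_SIZE);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, FILE_SIZE);
            long t0 = System.nanoTime();
            long heap = sumHeap(data, offsets);
            long t1 = System.nanoTime();
            long mapped = sumMapped(map, offsets);
            long t2 = System.nanoTime();
            long channel = sumChannel(ch, offsets);
            long t3 = System.nanoTime();
            System.out.printf("heap:    %d us (sum %d)%n", (t1 - t0) / 1_000, heap);
            System.out.printf("mmap:    %d us (sum %d)%n", (t2 - t1) / 1_000, mapped);
            System.out.printf("channel: %d us (sum %d)%n", (t3 - t2) / 1_000, channel);
        }
    }
}
```

The three sums must match, since all three paths read the same bytes. Only the timings differ, and they will vary with the machine, the OS and the state of the page cache.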
Re: Slower search after 8.5.x to >=8.6
Thanks for the answer.
NIOFSDirectory is just an example. The degradation also occurs with
MMapDirectory and SimpleFSDirectory.

We are using Elasticsearch, which offers simplefs (SimpleFSDirectory),
niofs (NIOFSDirectory), mmapfs (MMapDirectory) and hybridfs
(NIOFSDirectory + MMapDirectory). For us, niofs has so far been a little
faster than the other stores.

Yes, FSDirectory#open is fast (on both commits), but right now it is
difficult to test this on production Elasticsearch.
But why is FSDirectory#open fast? How should I understand this?

On Thu, Apr 8, 2021 at 13:49, Adrien Grand <jpountz@gmail.com> wrote:
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Slower search after 8.5.x to >=8.6
FSDirectory#open is just a utility method that tries to pick the best
Directory implementation based on the platform. It's most likely
MMapDirectory for you, which is the directory implementation we use on all
64-bit platforms. So it's intriguing that you are seeing a slowdown with
MMapDirectory but not with FSDirectory#open. To my knowledge, Elasticsearch
is not doing anything special that could explain why MMapDirectory is slow
with Elasticsearch yet fast with Lucene.

Regardless of the Directory implementation, it's surprising that term
lookups would be the bottleneck for query execution. It's usually more of a
bottleneck when indexing with IndexWriter#updateDocument, which needs to
perform one ID lookup for every indexed document. I guess that the queries
that you are running match so few hits that very little time is spent
reading postings, is that correct? But then that would also mean that your
queries are running very fast, likely on the order of a few millis? Or
maybe you have misconfigured your merge policy in a way that makes your
indices have so many segments that terms dictionary lookups may be a
bottleneck?

On Thu, Apr 8, 2021 at 1:40 PM <mihaylovnikitos@gmail.com> wrote:
> [...]

--
Adrien
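
The reasoning above (few hits or many segments make term lookups dominate) can be made concrete with a toy cost model. This is only a sketch: it assumes one terms dictionary lookup per query term per segment and a postings cost proportional to the number of hits. The class name LookupCostSketch and every per-operation cost below are hypothetical values chosen for illustration, not measurements from Lucene.

```java
final class LookupCostSketch {

    // Cost of terms dictionary lookups: one lookup per query term per segment.
    static double lookupMicros(int segments, int queryTerms, double microsPerLookup) {
        return (double) segments * queryTerms * microsPerLookup;
    }

    // Cost of reading postings: proportional to the number of matching documents.
    static double postingsMicros(long hits, double microsPerHit) {
        return hits * microsPerHit;
    }

    // Fraction of total query time spent in term lookups, between 0.0 and 1.0.
    static double lookupShare(int segments, int queryTerms, long hits,
                              double microsPerLookup, double microsPerHit) {
        double lookups = lookupMicros(segments, queryTerms, microsPerLookup);
        double total = lookups + postingsMicros(hits, microsPerHit);
        return total == 0 ? 0 : lookups / total;
    }

    public static void main(String[] args) {
        double perLookup = 2.0; // hypothetical cost of one term lookup, in microseconds
        double perHit = 0.01;   // hypothetical cost of one posting, in microseconds
        int queryTerms = 2;
        int[] segmentCounts = {5, 50};
        long[] hitCounts = {100, 1_000_000};

        for (int segments : segmentCounts) {
            for (long hits : hitCounts) {
                double share = lookupShare(segments, queryTerms, hits, perLookup, perHit);
                System.out.printf("segments=%d hits=%d lookup share=%.1f%%%n",
                        segments, hits, share * 100);
            }
        }
    }
}
```

Under these assumptions, a slower term lookup is only visible in total query time when the lookup share is large, which happens with few hits or many segments.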
Re: Slower search after 8.5.x to >=8.6
> It's usually more a bottleneck when indexing with
> IndexWriter#updateDocument, which needs to perform one ID lookup for every
> indexed document. I guess that the queries that you are running match so
> few hits that very little time is spent reading postings, is that correct?
> But then that would also mean that your queries are running very fast,
> likely in the order of a few millis?

IndexWriter#updateDocument: I did not notice any problems there, but I
only checked briefly.

I tried different searches on different data. Yes, the fewer the hits,
the bigger the slowdown.

> Or maybe you have misconfigured your merge policy in a way that makes your
> indices have so many segments that terms dictionary lookups may be a
> bottleneck?

The problem is reproducible with different numbers and sizes of
segments, but the fewer the segments, the smaller the degradation in
speed.
This behavior was there before the upgrade.

We don't have many segments, and the merge policy did not change.

On Thu, Apr 8, 2021 at 18:50, Adrien Grand <jpountz@gmail.com> wrote:
> [...]