Mailing List Archive

Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)
Hello,

recently I updated the Lucene version in one of our products from 8.3 to 8.8.x (8.8.2 as of now).
The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish.

So I took a closer look at one test-case which showed a severe slowdown
(it’s doing small update, flush, search cycles in order to stress NRT; the purpose is to see performance-changes in an early stage ???? ):

Lucene 8.3: ~2,3s
Lucene 8.8.x: 25s

This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and do a comparison.

The gap is caused by different amount of calls to sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) WindowsNativeDispatcher.java (native)
8.3: about 150 calls
8.8: about 12500 calls

In order to hunt down what is causing this, I took a look at the open() in NRTDirectory.
Here I could see that the amount of calls to that open is in the same ballpark for 8.3 and 8.8

The difference is that in 8.3 nearly all files are available in the underlying RAMDirectory. While in 8.8 files are opened for reading that do not (yet) exist.
This leads to a call to the WindowsNativeDispatcher.CreateFile0

Add the end of the mail I added two example-stacktraces that show this behavior.

Has someone an idea what change might cause this or if I need to do something different in 8.8 compared to 8.3?


Thanks for any help,

Markus

Here is an example stacktrace that is causing such a try of a read-access to non-existing file:

Filename= _0.fdm (IOContext is READ) (I checked the directory on harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir)

openInput:100, FilterDirectory (org.apache.lucene.store)
openInput:100, FilterDirectory (org.apache.lucene.store)
openChecksumInput:157, Directory (org.apache.lucene.store)
finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing)
flush:81, StoredFieldsConsumer (org.apache.lucene.index)
flush:239, DefaultIndexingChain (org.apache.lucene.index)
flush:350, DocumentsWriterPerThread (org.apache.lucene.index)
doFlush:476, DocumentsWriter (org.apache.lucene.index)
flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
getReader:605, IndexWriter (org.apache.lucene.index)
doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index)
openIfChanged:235, DirectoryReader (org.apache.lucene.index)


In a consequence later accesses to such files also lead to the state that the file is not within the RAMDirectory but only on harddisk.
Example:



Filename _1.fdx Context = READ (file is on harddisk but not in RAMDirectory)



openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openChecksumInput:157, Directory (org.apache.lucene.store)

write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)

createCompoundFile:5316, IndexWriter (org.apache.lucene.index)

sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)

flush:395, DocumentsWriterPerThread (org.apache.lucene.index)

doFlush:476, DocumentsWriter (org.apache.lucene.index)

flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)

getReader:605, IndexWriter (org.apache.lucene.index)

doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)

doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)

openIfChanged:235, DirectoryReader (org.apache.lucene.index)

Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr. Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz Streibich - http://www.softwareag.com
Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) [ In reply to ]
> The update showed no issues (e.g. compiled without changes) but I noticed
that our test-suites take a lot longer to finish.

Hmm, that sounds bad. We need our tests to stay fast but also do a good
job testing things ;)

Does your production usage also slow down? Tests do other interesting
things, e.g. enable assertions, randomly swap in different codecs,
different Directory impls (if you are using Lucene's randomized test
infrastructure
<http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>),
etc.

> While in 8.8 files are opened for reading that do not (yet) exist.

That is incredibly strange! Lucene should never do that (opening a
non-existing file for "read", causing that file to become created through
Windows CreateFile), that I am aware of. Can you share the full source
code of the test case?

Try to eliminate parts of the test maybe? Do you see the same slowdown if
you don't use NRTCachingDirectory at all? (Just straight to
MMapDirectory.) There were some recent fixes to NRTCachineDirectory, e.g.
https://issues.apache.org/jira/browse/LUCENE-9115 and
https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are related?

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus
<Markus.Gietzen@softwareag.com.invalid> wrote:

> Hello,
>
> recently I updated the Lucene version in one of our products from 8.3 to
> 8.8.x (8.8.2 as of now).
> The update showed no issues (e.g. compiled without changes) but I noticed
> that our test-suites take a lot longer to finish.
>
> So I took a closer look at one test-case which showed a severe slowdown
> (it’s doing small update, flush, search cycles in order to stress NRT;
> the purpose is to see performance-changes in an early stage ???? ):
>
> Lucene 8.3: ~2,3s
> Lucene 8.8.x: 25s
>
> This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8
> and do a comparison.
>
> The gap is caused by different amount of calls to
> sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int,
> int) WindowsNativeDispatcher.java (native)
> 8.3: about 150 calls
> 8.8: about 12500 calls
>
> In order to hunt down what is causing this, I took a look at the open() in
> NRTDirectory.
> Here I could see that the amount of calls to that open is in the same
> ballpark for 8.3 and 8.8
>
> The difference is that in 8.3 nearly all files are available in the
> underlying RAMDirectory. While in 8.8 files are opened for reading that do
> not (yet) exist.
> This leads to a call to the WindowsNativeDispatcher.CreateFile0
>
> Add the end of the mail I added two example-stacktraces that show this
> behavior.
>
> Has someone an idea what change might cause this or if I need to do
> something different in 8.8 compared to 8.3?
>
>
> Thanks for any help,
>
> Markus
>
> Here is an example stacktrace that is causing such a try of a read-access
> to non-existing file:
>
> Filename= _0.fdm (IOContext is READ) (I checked the directory on
> harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
> openInput:100, FilterDirectory (org.apache.lucene.store)
> openChecksumInput:157, Directory (org.apache.lucene.store)
> finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
> finish:480, CompressingStoredFieldsWriter
> (org.apache.lucene.codecs.compressing)
> flush:81, StoredFieldsConsumer (org.apache.lucene.index)
> flush:239, DefaultIndexingChain (org.apache.lucene.index)
> flush:350, DocumentsWriterPerThread (org.apache.lucene.index)
> doFlush:476, DocumentsWriter (org.apache.lucene.index)
> flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
> getReader:605, IndexWriter (org.apache.lucene.index)
> doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index)
> openIfChanged:235, DirectoryReader (org.apache.lucene.index)
>
>
> In a consequence later accesses to such files also lead to the state that
> the file is not within the RAMDirectory but only on harddisk.
> Example:
>
>
>
> Filename _1.fdx Context = READ (file is on harddisk but not in
> RAMDirectory)
>
>
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openChecksumInput:157, Directory (org.apache.lucene.store)
>
> write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)
>
> createCompoundFile:5316, IndexWriter (org.apache.lucene.index)
>
> sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)
>
> flush:395, DocumentsWriterPerThread (org.apache.lucene.index)
>
> doFlush:476, DocumentsWriter (org.apache.lucene.index)
>
> flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
>
> getReader:605, IndexWriter (org.apache.lucene.index)
>
> doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)
>
> doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)
>
> openIfChanged:235, DirectoryReader (org.apache.lucene.index)
>
> Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt,
> Germany – Registergericht/Commercial register: Darmstadt HRB 1562 -
> Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr.
> Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg -
> Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz
> Streibich - http://www.softwareag.com
>
RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) [ In reply to ]
Hi,

thanks for reaching me that fast!

Your hint that there were changes to NRTCachingDirectory were the right point:

I copied the 8.3 NRTCachingDirectory implementation into the project (with a different classname, you get the idea???? ) and used that one.

And believe it or not: everything is fine. Now 8.8 performs as fast as 8.3!

I will check the differences and put them in step by step to find out which change causes the slow-down.
I’ll report here.

Bye,

Markus


From: Michael McCandless <lucene@mikemccandless.com>
Sent: Wednesday, 19 May 2021 13:39
To: Lucene Users <java-user@lucene.apache.org>; Gietzen, Markus <Markus.Gietzen@softwareag.com>
Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

> The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish.

Hmm, that sounds bad. We need our tests to stay fast but also do a good job testing things ;)

Does your production usage also slow down? Tests do other interesting things, e.g. enable assertions, randomly swap in different codecs, different Directory impls (if you are using Lucene's randomized test infrastructure<http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>), etc.

> While in 8.8 files are opened for reading that do not (yet) exist.

That is incredibly strange! Lucene should never do that (opening a non-existing file for "read", causing that file to become created through Windows CreateFile), that I am aware of. Can you share the full source code of the test case?

Try to eliminate parts of the test maybe? Do you see the same slowdown if you don't use NRTCachingDirectory at all? (Just straight to MMapDirectory.) There were some recent fixes to NRTCachineDirectory, e.g. https://issues.apache.org/jira/browse/LUCENE-9115 and https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are related?

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus <Markus.Gietzen@softwareag.com.invalid<mailto:Markus.Gietzen@softwareag.com.invalid>> wrote:
Hello,

recently I updated the Lucene version in one of our products from 8.3 to 8.8.x (8.8.2 as of now).
The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish.

So I took a closer look at one test-case which showed a severe slowdown
(it’s doing small update, flush, search cycles in order to stress NRT; the purpose is to see performance-changes in an early stage ???? ):

Lucene 8.3: ~2,3s
Lucene 8.8.x: 25s

This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and do a comparison.

The gap is caused by different amount of calls to sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) WindowsNativeDispatcher.java (native)
8.3: about 150 calls
8.8: about 12500 calls

In order to hunt down what is causing this, I took a look at the open() in NRTDirectory.
Here I could see that the amount of calls to that open is in the same ballpark for 8.3 and 8.8

The difference is that in 8.3 nearly all files are available in the underlying RAMDirectory. While in 8.8 files are opened for reading that do not (yet) exist.
This leads to a call to the WindowsNativeDispatcher.CreateFile0

Add the end of the mail I added two example-stacktraces that show this behavior.

Has someone an idea what change might cause this or if I need to do something different in 8.8 compared to 8.3?


Thanks for any help,

Markus

Here is an example stacktrace that is causing such a try of a read-access to non-existing file:

Filename= _0.fdm (IOContext is READ) (I checked the directory on harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir)

openInput:100, FilterDirectory (org.apache.lucene.store)
openInput:100, FilterDirectory (org.apache.lucene.store)
openChecksumInput:157, Directory (org.apache.lucene.store)
finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing)
flush:81, StoredFieldsConsumer (org.apache.lucene.index)
flush:239, DefaultIndexingChain (org.apache.lucene.index)
flush:350, DocumentsWriterPerThread (org.apache.lucene.index)
doFlush:476, DocumentsWriter (org.apache.lucene.index)
flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
getReader:605, IndexWriter (org.apache.lucene.index)
doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index)
openIfChanged:235, DirectoryReader (org.apache.lucene.index)


In a consequence later accesses to such files also lead to the state that the file is not within the RAMDirectory but only on harddisk.
Example:



Filename _1.fdx Context = READ (file is on harddisk but not in RAMDirectory)



openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openChecksumInput:157, Directory (org.apache.lucene.store)

write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)

createCompoundFile:5316, IndexWriter (org.apache.lucene.index)

sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)

flush:395, DocumentsWriterPerThread (org.apache.lucene.index)

doFlush:476, DocumentsWriter (org.apache.lucene.index)

flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)

getReader:605, IndexWriter (org.apache.lucene.index)

doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)

doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)

openIfChanged:235, DirectoryReader (org.apache.lucene.index)

Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr. Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz Streibich - http://www.softwareag.com
Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) [ In reply to ]
LUCENE-9115 certainly creates more files in the FSDirectory than in the
ByteBuffersDirectory, e.g. stored fields are now always flushed to the
FSDirectory since their size can't be known in advance, while they were
always written to the ByteBuffersDirectory before (which was a big since
these files could be arbitrary large).

On Wed, May 19, 2021 at 1:55 PM Gietzen, Markus
<Markus.Gietzen@softwareag.com.invalid> wrote:

> Hi,
>
> thanks for reaching me that fast!
>
> Your hint that there were changes to NRTCachingDirectory were the right
> point:
>
> I copied the 8.3 NRTCachingDirectory implementation into the project (with
> a different classname, you get the idea???? ) and used that one.
>
> And believe it or not: everything is fine. Now 8.8 performs as fast as 8.3!
>
> I will check the differences and put them in step by step to find out
> which change causes the slow-down.
> I’ll report here.
>
> Bye,
>
> Markus
>
>
> From: Michael McCandless <lucene@mikemccandless.com>
> Sent: Wednesday, 19 May 2021 13:39
> To: Lucene Users <java-user@lucene.apache.org>; Gietzen, Markus <
> Markus.Gietzen@softwareag.com>
> Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from
> 8.3.0)
>
> > The update showed no issues (e.g. compiled without changes) but I
> noticed that our test-suites take a lot longer to finish.
>
> Hmm, that sounds bad. We need our tests to stay fast but also do a good
> job testing things ;)
>
> Does your production usage also slow down? Tests do other interesting
> things, e.g. enable assertions, randomly swap in different codecs,
> different Directory impls (if you are using Lucene's randomized test
> infrastructure<
> http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>),
> etc.
>
> > While in 8.8 files are opened for reading that do not (yet) exist.
>
> That is incredibly strange! Lucene should never do that (opening a
> non-existing file for "read", causing that file to become created through
> Windows CreateFile), that I am aware of. Can you share the full source
> code of the test case?
>
> Try to eliminate parts of the test maybe? Do you see the same slowdown if
> you don't use NRTCachingDirectory at all? (Just straight to
> MMapDirectory.) There were some recent fixes to NRTCachineDirectory, e.g.
> https://issues.apache.org/jira/browse/LUCENE-9115 and
> https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are
> related?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus
> <Markus.Gietzen@softwareag.com.invalid<mailto:
> Markus.Gietzen@softwareag.com.invalid>> wrote:
> Hello,
>
> recently I updated the Lucene version in one of our products from 8.3 to
> 8.8.x (8.8.2 as of now).
> The update showed no issues (e.g. compiled without changes) but I noticed
> that our test-suites take a lot longer to finish.
>
> So I took a closer look at one test-case which showed a severe slowdown
> (it’s doing small update, flush, search cycles in order to stress NRT;
> the purpose is to see performance-changes in an early stage ???? ):
>
> Lucene 8.3: ~2,3s
> Lucene 8.8.x: 25s
>
> This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8
> and do a comparison.
>
> The gap is caused by different amount of calls to
> sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int,
> int) WindowsNativeDispatcher.java (native)
> 8.3: about 150 calls
> 8.8: about 12500 calls
>
> In order to hunt down what is causing this, I took a look at the open() in
> NRTDirectory.
> Here I could see that the amount of calls to that open is in the same
> ballpark for 8.3 and 8.8
>
> The difference is that in 8.3 nearly all files are available in the
> underlying RAMDirectory. While in 8.8 files are opened for reading that do
> not (yet) exist.
> This leads to a call to the WindowsNativeDispatcher.CreateFile0
>
> Add the end of the mail I added two example-stacktraces that show this
> behavior.
>
> Has someone an idea what change might cause this or if I need to do
> something different in 8.8 compared to 8.3?
>
>
> Thanks for any help,
>
> Markus
>
> Here is an example stacktrace that is causing such a try of a read-access
> to non-existing file:
>
> Filename= _0.fdm (IOContext is READ) (I checked the directory on
> harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
> openInput:100, FilterDirectory (org.apache.lucene.store)
> openChecksumInput:157, Directory (org.apache.lucene.store)
> finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
> finish:480, CompressingStoredFieldsWriter
> (org.apache.lucene.codecs.compressing)
> flush:81, StoredFieldsConsumer (org.apache.lucene.index)
> flush:239, DefaultIndexingChain (org.apache.lucene.index)
> flush:350, DocumentsWriterPerThread (org.apache.lucene.index)
> doFlush:476, DocumentsWriter (org.apache.lucene.index)
> flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
> getReader:605, IndexWriter (org.apache.lucene.index)
> doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index)
> openIfChanged:235, DirectoryReader (org.apache.lucene.index)
>
>
> In a consequence later accesses to such files also lead to the state that
> the file is not within the RAMDirectory but only on harddisk.
> Example:
>
>
>
> Filename _1.fdx Context = READ (file is on harddisk but not in
> RAMDirectory)
>
>
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openInput:100, FilterDirectory (org.apache.lucene.store)
>
> openChecksumInput:157, Directory (org.apache.lucene.store)
>
> write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)
>
> createCompoundFile:5316, IndexWriter (org.apache.lucene.index)
>
> sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)
>
> flush:395, DocumentsWriterPerThread (org.apache.lucene.index)
>
> doFlush:476, DocumentsWriter (org.apache.lucene.index)
>
> flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
>
> getReader:605, IndexWriter (org.apache.lucene.index)
>
> doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)
>
> doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)
>
> openIfChanged:235, DirectoryReader (org.apache.lucene.index)
>
> Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt,
> Germany – Registergericht/Commercial register: Darmstadt HRB 1562 -
> Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr.
> Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg -
> Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz
> Streibich - http://www.softwareag.com
>


--
Adrien
RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) [ In reply to ]
Hi again,

I found the difference causing the slow down.

It's NRTCachingDirectory#doCacheWrite method.

With the implementation of 8.8 it's slow. With the version of 8.3 it's fast.

Hope it helps,

Markus


-----Original Message-----
From: Gietzen, Markus <Markus.Gietzen@softwareag.com.INVALID>
Sent: Wednesday, 19 May 2021 13:55
To: Michael McCandless <lucene@mikemccandless.com>; Lucene Users <java-user@lucene.apache.org>
Subject: RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

Hi,

thanks for reaching me that fast!

Your hint that there were changes to NRTCachingDirectory were the right point:

I copied the 8.3 NRTCachingDirectory implementation into the project (with a different classname, you get the idea???? ) and used that one.

And believe it or not: everything is fine. Now 8.8 performs as fast as 8.3!

I will check the differences and put them in step by step to find out which change causes the slow-down.
I’ll report here.

Bye,

Markus


From: Michael McCandless <lucene@mikemccandless.com>
Sent: Wednesday, 19 May 2021 13:39
To: Lucene Users <java-user@lucene.apache.org>; Gietzen, Markus <Markus.Gietzen@softwareag.com>
Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

> The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish.

Hmm, that sounds bad. We need our tests to stay fast but also do a good job testing things ;)

Does your production usage also slow down? Tests do other interesting things, e.g. enable assertions, randomly swap in different codecs, different Directory impls (if you are using Lucene's randomized test infrastructure<http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>), etc.

> While in 8.8 files are opened for reading that do not (yet) exist.

That is incredibly strange! Lucene should never do that (opening a non-existing file for "read", causing that file to become created through Windows CreateFile), that I am aware of. Can you share the full source code of the test case?

Try to eliminate parts of the test maybe? Do you see the same slowdown if you don't use NRTCachingDirectory at all? (Just straight to MMapDirectory.) There were some recent fixes to NRTCachineDirectory, e.g. https://issues.apache.org/jira/browse/LUCENE-9115 and https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are related?

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus <Markus.Gietzen@softwareag.com.invalid<mailto:Markus.Gietzen@softwareag.com.invalid>> wrote:
Hello,

recently I updated the Lucene version in one of our products from 8.3 to 8.8.x (8.8.2 as of now).
The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish.

So I took a closer look at one test-case which showed a severe slowdown (it’s doing small update, flush, search cycles in order to stress NRT; the purpose is to see performance-changes in an early stage ???? ):

Lucene 8.3: ~2,3s
Lucene 8.8.x: 25s

This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and do a comparison.

The gap is caused by different amount of calls to sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) WindowsNativeDispatcher.java (native)
8.3: about 150 calls
8.8: about 12500 calls

In order to hunt down what is causing this, I took a look at the open() in NRTDirectory.
Here I could see that the amount of calls to that open is in the same ballpark for 8.3 and 8.8

The difference is that in 8.3 nearly all files are available in the underlying RAMDirectory. While in 8.8 files are opened for reading that do not (yet) exist.
This leads to a call to the WindowsNativeDispatcher.CreateFile0

Add the end of the mail I added two example-stacktraces that show this behavior.

Has someone an idea what change might cause this or if I need to do something different in 8.8 compared to 8.3?


Thanks for any help,

Markus

Here is an example stacktrace that is causing such a try of a read-access to non-existing file:

Filename= _0.fdm (IOContext is READ) (I checked the directory on harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir)

openInput:100, FilterDirectory (org.apache.lucene.store) openInput:100, FilterDirectory (org.apache.lucene.store) openChecksumInput:157, Directory (org.apache.lucene.store) finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing)
flush:81, StoredFieldsConsumer (org.apache.lucene.index) flush:239, DefaultIndexingChain (org.apache.lucene.index) flush:350, DocumentsWriterPerThread (org.apache.lucene.index) doFlush:476, DocumentsWriter (org.apache.lucene.index) flushAllThreads:656, DocumentsWriter (org.apache.lucene.index) getReader:605, IndexWriter (org.apache.lucene.index) doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index) openIfChanged:235, DirectoryReader (org.apache.lucene.index)


In a consequence later accesses to such files also lead to the state that the file is not within the RAMDirectory but only on harddisk.
Example:



Filename _1.fdx Context = READ (file is on harddisk but not in RAMDirectory)



openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openChecksumInput:157, Directory (org.apache.lucene.store)

write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)

createCompoundFile:5316, IndexWriter (org.apache.lucene.index)

sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)

flush:395, DocumentsWriterPerThread (org.apache.lucene.index)

doFlush:476, DocumentsWriter (org.apache.lucene.index)

flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)

getReader:605, IndexWriter (org.apache.lucene.index)

doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)

doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)

openIfChanged:235, DirectoryReader (org.apache.lucene.index)

Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr. Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz Streibich - http://www.softwareag.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org