Fwd: TermsEnum.seekExact degraded performance somewhere between Lucene 7.7.0 and 8.5.1.
Hi all.

I've been tracking down slow seeking performance in TermsEnum after
updating to Lucene 8.5.1.

On 8.5.1:

SegmentTermsEnum.seekExact: 33,829 ms (70.2%) (remaining time in our code)
SegmentTermsEnumFrame.loadBlock: 29,104 ms (60.4%)
CompressionAlgorithm$2.read: 25,789 ms (53.5%)
LowercaseAsciiCompression.decompress: 25,789 ms (53.5%)
DataInput.readVInt: 24,690 ms (51.2%)
SegmentTermsEnumFrame.scanToTerm: 2,921 ms (6.1%)

On 7.7.0 (previous version we were using):

SegmentTermsEnum.seekExact: 5,897 ms (43.7%) (remaining time in our code)
SegmentTermsEnumFrame.loadBlock: 3,499 ms (25.9%)
BufferedIndexInput.readBytes: 1,500 ms (11.1%)
DataInput.readVInt: 1,108 ms (8.2%)
SegmentTermsEnumFrame.scanToTerm: 1,501 ms (11.1%)

So on the surface it looks like the new version spends proportionally
less time scanning and much more time loading and decompressing blocks?
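
To put rough numbers on that, here is a minimal sketch that computes the
per-frame slowdown between the two versions. The class and method names
(ProfileCompare, slowdown) are made up for illustration, and the times
are simply copied from the two profiles above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProfileCompare {
    // Profiler times in ms, copied from the two profiles in this mail.
    // Each value is {time on 7.7.0, time on 8.5.1}.
    static final Map<String, long[]> TIMES = new LinkedHashMap<>();
    static {
        TIMES.put("SegmentTermsEnum.seekExact", new long[] {5_897, 33_829});
        TIMES.put("SegmentTermsEnumFrame.loadBlock", new long[] {3_499, 29_104});
        TIMES.put("DataInput.readVInt", new long[] {1_108, 24_690});
        TIMES.put("SegmentTermsEnumFrame.scanToTerm", new long[] {1_501, 2_921});
    }

    // How many times slower the given frame is on 8.5.1 than on 7.7.0.
    static double slowdown(String frame) {
        long[] t = TIMES.get(frame);
        return (double) t[1] / t[0];
    }

    public static void main(String[] args) {
        for (String frame : TIMES.keySet()) {
            System.out.printf("%-34s %.1fx%n", frame, slowdown(frame));
        }
    }
}
```

With these figures, seekExact comes out roughly 5.7x slower overall,
loadBlock about 8.3x, readVInt about 22x, and scanToTerm about 1.9x, so
scanning is slower in absolute terms but a smaller share of the total.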

I'm looking for some clues as to what might have changed here, and
whether it's something we can avoid. So far, LUCENE-4702 looks like it
may be related.

TX
