Mailing List Archive

Blog post - Profiling the Lucene nightly benchmarks
Hello everyone!

I recently wrote a blog post which looks into profiling data of the Lucene
nightly benchmarks. I emailed Michael McCandless (the maintainer of the
benchmarks) and he suggested that I post about it here, so here we go.

The post is available at https://blunders.io/posts/lucene-bench-2021-01-10.
I have published some more periodic profiling data at
https://blunders.io/lucene-bench - this is not really nightly, but one
might be able to spot changes over time.

If you have any feedback or questions, I'll happily listen and answer.

best regards,
Anton Hägerstrand

PS. If no one beats me to it, I'll open a PR for the TermGroupSelector
thing ;)
Re: Blog post - Profiling the Lucene nightly benchmarks
This is very cool, thanks for sharing Anton!

On Fri, Jan 15, 2021 at 11:40 PM Anton Hägerstrand <anton@blunders.io> wrote:

> Hello everyone!
>
> I recently wrote a blog post which looks into profiling data of the Lucene
> nightly benchmarks. I emailed Michael McCandless (the maintainer of the
> benchmarks) and he suggested that I post about it here, so here we go.
>
> The post is available at https://blunders.io/posts/lucene-bench-2021-01-10.
> I have published some more periodic profiling data at
> https://blunders.io/lucene-bench - this is not really nightly, but one
> might be able to spot changes over time.
>
> If you have any feedback or questions, I'll happily listen and answer.
>
> best regards,
> Anton Hägerstrand
>
> PS. If no one beats me to it, I'll open a PR for the TermGroupSelector
> thing ;)
>
Re: Blog post - Profiling the Lucene nightly benchmarks
Indeed! Thank you for all the helpful suggestions, especially from my
point of view re: HNSW, which is indeed costly to index. I am
surprised how much time is spent in SparseBitSet; perhaps a full
(non-sparse) bitset is called for, although I had initially shied away
from it since this indexing is already quite RAM-intensive. Also, I
did not know about Math.fma, I wonder if we can speed up dot-product
with it. And your observation about the vector indexing dominating the
indexing benchmark is fair - we may want to consider indexing vectors
more sparsely to trim that.
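
To make the Math.fma idea concrete, here is a minimal sketch of a float dot product written both ways. The class and method names are hypothetical and this is not Lucene's actual dot-product code. Math.fma (available since Java 9) computes a * b + c with a single rounding; whether it is actually faster depends on the JIT and on the CPU having a fused multiply-add instruction, so it would need to be measured.

```java
public final class DotProductSketch {
    private DotProductSketch() {}

    /** Plain loop: a separate multiply and add (two roundings) per element. */
    public static float dotPlain(float[] a, float[] b) {
        checkSameLength(a, b);
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Same loop with Math.fma: a[i] * b[i] + sum, rounded once per element. */
    public static float dotFma(float[] a, float[] b) {
        checkSameLength(a, b);
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum = Math.fma(a[i], b[i], sum);
        }
        return sum;
    }

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f};
        float[] b = {4f, 5f, 6f};
        System.out.println(dotPlain(a, b));
        System.out.println(dotFma(a, b));
    }
}
```

Because the fused version rounds differently, the two methods can differ in the last bits for general inputs, which matters if scores are compared exactly.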

On Sat, Jan 16, 2021 at 5:18 AM Adrien Grand <jpountz@gmail.com> wrote:
>
> This is very cool, thanks for sharing Anton!

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Blog post - Profiling the Lucene nightly benchmarks
Thank you for sharing, Anton! Benchmarking reveals all sorts of fun
things, and those flame charts are great fun.

I also added very basic JFR profiling results to nightly benchmarks (thanks
to Robert for the idea and pointers). Not nearly as pretty and interactive
as flame charts :)
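
For readers who have not used JFR from code, the following is a rough sketch of how CPU samples can be recorded and then reduced to a "top methods" count with the JDK's jdk.jfr API. It is not how the nightly benchmarks produce their tables; the class name, the 10 ms sampling period, and the temp-file handling are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

public final class JfrHotSpotsSketch {
    private JfrHotSpotsSketch() {}

    /** Records CPU execution samples while the workload runs, then dumps them to a .jfr file. */
    public static Path record(Runnable workload) throws IOException {
        Path file = Files.createTempFile("sketch", ".jfr");
        try (Recording recording = new Recording()) {
            // Sampling period is an arbitrary choice for this sketch.
            recording.enable("jdk.ExecutionSample").withPeriod(Duration.ofMillis(10));
            recording.start();
            workload.run();
            recording.stop();
            recording.dump(file);
        }
        return file;
    }

    /** Counts how often each method appears as the top frame of a CPU sample. */
    public static Map<String, Integer> topFrames(Path file) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
            if (!event.getEventType().getName().equals("jdk.ExecutionSample")) {
                continue;
            }
            RecordedStackTrace stack = event.getStackTrace();
            if (stack == null || stack.getFrames().isEmpty()) {
                continue;
            }
            RecordedFrame top = stack.getFrames().get(0);
            String name = top.getMethod().getType().getName() + "." + top.getMethod().getName();
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }
}
```

A flame chart is built from the same samples, but keeps the whole stack of each sample instead of only its top frame.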

E.g. see last night's run:
https://home.apache.org/~mikemccand/lucenebench/2021.01.19.00.03.46.html
(includes cpu and heap, for the multiple indices built by nightly
benchmarks, and then also search results aggregated from the 20 JVM
iterations).

Building FSTs is the top hot-spot in allocations (heap), followed by
SparseBitSet.insertLong.
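
On the bitset trade-off raised earlier in the thread (a full, non-sparse bitset versus RAM use), a small sketch may help put numbers on it. The class below is hypothetical and is not one of Lucene's bitset classes: it allocates a single long[] up front, so setting a bit never allocates, at a fixed cost of one bit per possible index.

```java
public final class DenseBitSetSketch {
    private final long[] words;

    public DenseBitSetSketch(int numBits) {
        // One up-front allocation: 1 bit per index, rounded up to whole 64-bit words.
        this.words = new long[(numBits + 63) >>> 6];
    }

    /** Marks the index as set; no allocation happens here. */
    public void set(int index) {
        // Java masks a long shift count to its low 6 bits, so this selects bit (index % 64).
        words[index >>> 6] |= 1L << index;
    }

    public boolean get(int index) {
        return (words[index >>> 6] & (1L << index)) != 0;
    }

    /** Bytes used by the bit storage itself (ignores object headers). */
    public long ramBytesForBits() {
        return (long) words.length * Long.BYTES;
    }

    public static void main(String[] args) {
        DenseBitSetSketch bits = new DenseBitSetSketch(1_000_000);
        bits.set(42);
        System.out.println(bits.get(42));            // true
        System.out.println(bits.ramBytesForBits());  // 125000
    }
}
```

So a dense set over one million indices costs about 125 KB regardless of how many bits are set; whether that is acceptable depends on how many such sets are live at once during indexing, which the thread does not say.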

Mike McCandless

http://blog.mikemccandless.com


On Sun, Jan 17, 2021 at 7:15 PM Michael Sokolov <msokolov@gmail.com> wrote:

> Indeed! Thank you for all the helpful suggestions, especially from my
> point of view re: HNSW, which is indeed costly to index. I am
> surprised how much time is spent in SparseBitSet; perhaps a full
> (non-sparse) bitset is called for, although I had initially shied away
> from it since this indexing is already quite RAM-intensive. Also, I
> did not know about Math.fma, I wonder if we can speed up dot-product
> with it. And your observation about the vector indexing dominating the
> indexing benchmark is fair - we may want to consider indexing vectors
> more sparsely to trim that.
Re: Blog post - Profiling the Lucene nightly benchmarks
Thanks to Anton exposing a simple API on Blunders.io, I've integrated this
into our nightly benchmarks, and now each night, Lucene benchmarks upload
JFR files from indexing and searching tasks, producing flame charts that we
all can interact with to find unexpected hot spots!

I've linked to the Blunders.io site from index.html on our nightly
benchmark (search for blunders.io, near the top):
https://home.apache.org/~mikemccand/lucenebench

And it links to this entry page, holding all of our nightly profiler
results: https://blunders.io/lucene-bench

E.g. last night's flame chart for indexing all of English Wikipedia as docs
averaging ~4 KB in size:
https://blunders.io/jfr-demo/indexing-4kb-2021.01.26.00.04.35/top_down_cpu_samples?endTime=1611640955698&startTime=1611640825550

And from all search tasks (warning: it's kinda slow, I think because we run
20 JVM iterations for search tasks and then concatenate all the resulting
JFRs):
https://blunders.io/jfr-demo/searching-2021.01.26.00.04.35/top_down_cpu_samples?endTime=1611649189073&startTime=1611647421135

Eventually we will need to delete some of them :) I haven't quite
implemented that just yet ...

Thank you Anton, this is very cool! I am looking forward to the first
optimization we make in Lucene after finding a surprising hotspot in these
flame charts!

Mike McCandless

http://blog.mikemccandless.com


On Tue, Jan 19, 2021 at 9:23 AM Michael McCandless <
lucene@mikemccandless.com> wrote:

> Thank you for sharing, Anton! Benchmarking reveals all sorts of fun
> things, and those flame charts are great fun.
>
> I also added very basic JFR profiling results to nightly benchmarks
> (thanks to Robert for the idea and pointers). Not nearly as pretty and
> interactive as flame charts :)
>
> E.g. see last night's run:
> https://home.apache.org/~mikemccand/lucenebench/2021.01.19.00.03.46.html
> (includes cpu and heap, for the multiple indices built by nightly
> benchmarks, and then also search results aggregated from the 20 JVM
> iterations).
>
> Building FSTs is the top hot-spot in allocations (heap), followed by
> SparseBitSet.insertLong.
>
> Mike McCandless
>
> http://blog.mikemccandless.com