Mailing List Archive

Lucene JMH benchmarks
This is not a proposal to add one more benchmark module to Lucene, but I've
been digging into simpler ways to explore luceneutil benchmark behavior via
a home game that allows simpler rules and introspection.

I've got a couple of half-baked straw men that I've been playing
with for such an experience, but various issues have kept them a bit out of
solid reach as yet.

In somewhat of a plan B mode that feels to me better than plan A, I already
have some JMH boilerplate and recent time spent on JMH work, and as I'm
essentially looking for the JMH experience, I've whipped up the groundwork
to get it.

I'm a big fan of JMH and its various profilers and functionality.

So I've got an early home game version up here if anyone else has ever had
such an itch: https://github.com/apache/lucene/pull/365

It can run a similar style of benchmark to luceneutil, using a
luceneutil-created index and task file. And as an example of more general
Lucene microbenchmark options, there is a simple, contrived FuzzyQuery
benchmark to play with.
More details are in the draft PR (which is just a nice home for it at the
moment), such as how to run the luceneutil approximation. There is also a
slim, early README here:
https://github.com/markrmiller/lucene/blob/JMH/lucene/jmh/README.md

From the PR, the easy getting-started copy and paste is below.

If you are a JMH fan like I am, check it out and enjoy. It's a small
experiment, but allows for a decent amount of play.

Mark

git clone -b JMH --single-branch https://github.com/markrmiller/lucene.git
cd lucene/lucene/jmh
./jmh.sh FuzzyQuery

// writes async-profiler flame graphs to lucene/lucene/jmh/work/*, 1 warmup
iteration, 1 measured iteration, 1 second each, measures throughput
https://github.com/markrmiller/lucene/blob/JMH/lucene/jmh/README.md#using-jmh-with-async-profiler
./jmh.sh FuzzyQuery -w 1 -wi 1 -r 1 -i 1 -prof async:dir=work;output=flamegraph
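The flags above ask JMH for 1 warmup iteration and 1 measured iteration of 1 second each, in throughput mode. As a rough illustration of what that mode boils down to (the class name and the stand-in workload here are made up for illustration, not part of the PR — the real benchmarks use JMH annotations like @Benchmark, @Warmup, and @Measurement and let JMH do this properly), a toy warmup-then-measure throughput loop might look like:

```java
// ThroughputSketch.java — toy sketch of JMH's throughput mode: run a warmup
// pass, then a timed pass, and report ops/sec. Illustrative only; JMH also
// handles forking, dead-code elimination, and statistics for you.
public class ThroughputSketch {

    // Stand-in workload; a real benchmark body would run e.g. a FuzzyQuery search.
    static long work(long x) {
        return Long.rotateLeft(x * 0x9E3779B97F4A7C15L, 31);
    }

    // Run the workload in a loop for `millis` ms and return ops/sec.
    static double measure(long millis) {
        long ops = 0;
        long sink = 0;
        long end = System.nanoTime() + millis * 1_000_000L;
        while (System.nanoTime() < end) {
            sink = work(sink + ops);
            ops++;
        }
        if (sink == 42) System.out.print(""); // consume the result so the loop isn't elided
        return ops / (millis / 1000.0);
    }

    public static void main(String[] args) {
        measure(1000);                 // 1 warmup iteration, 1 second (discarded)
        double thrpt = measure(1000);  // 1 measured iteration, 1 second
        System.out.printf("throughput: %.0f ops/sec%n", thrpt);
    }
}
```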

Re: Lucene JMH benchmarks
I may have spoken a little too soon about my lack of ambition for yet
another benchmark module. I was mainly after some simple stuff for myself,
with likely minimal time or effort, so I dropped this a bit as: hey, if you
want to do some JMH with Lucene, there is this stuff around that will get
you off the ground.

Hang on, though. I’ve found a low-cost path that I think might lead to a
small bit of traction. I had no personal interest, especially with few
resources for it, in just building another iteration of the benchmark
module. That’s meh to me. I think a JMH benchmark module could be so much
more interesting and valuable than that. But interest and value are as much
in the use as in the innate qualities of something.

I think I can see a path that could lead to a small sprout here, though.
If I see Uwe in there someday (not janitor-ing), I’ll feel it has probably
made the jump.

So who knows. No proposal yet, but there may be something interesting that
pops up here before too long.


Mark

On Fri, Oct 8, 2021 at 7:08 PM Mark Miller <markrmiller@gmail.com> wrote:

> This is not a proposal to add one more benchmark module to Lucene, but
> I've been digging into simpler ways to explore luceneutil benchmark
> behavior via a home game that allows simpler rules and introspection.
>
> I've got a couple half baked straw-men already that I have been playing
> with for such an experience, but there are various issues that have kept
> them a bit out of solid reach as of yet.
>
> In somewhat of a plan b mode that feels to me like a plan better than A
> mode, I already have some JMH boiler plate and recent time on JMH work, and
> as I'm essentially looking for the JMH experience, I've whipped up the
> groundwork to get it.
>
> I'm a big fan of JMH and it's various profilers and functionality.
>
> So I've got an early home game version up here if anyone else has ever had
> such an itch: https://github.com/apache/lucene/pull/365
>
> It can run a similar style of benchmark as luceneutil, using a luceneutil
> created index and task file. And as an example of more general lucene micro
> benchmark options there is a simple, contrived FuzzyQuery benchmark to play
> with.
> More details in the draft PR (which is just a nice home for it at the
> moment), such as how to run the luceneutil approximate. There is also a
> slim, early README here:
> https://github.com/markrmiller/lucene/blob/JMH/lucene/jmh/README.md
>
> From the PR, the easy getting started cut and paste is below.
>
> If you are a JMH fan like I am, check it out and enjoy. It's a small
> experiment, but allows for a decent amount of play.
>
> Mark
>
> git clone -b JMH --single-branch https://github.com/markrmiller/lucene.git
> cd lucene/lucene/jmh
> ./jmh.sh FuzzyQuery
>
> // async-profiler flame graphs to lucene/lucene/jmh/work/*, 1 warm up
> iteration, 1 iteration, 1 second each, measures throughput
>
> https://github.com/markrmiller/lucene/blob/JMH/lucene/jmh/README.md#using-jmh-with-async-profiler
> ./jmh.sh FuzzyQuery -w 1 -wi 1 -r 1 -i 1 -prof
> async:dir=work;output=flamegraph
>
--
- MRM