Mailing List Archive

Lucene 10
Hello everyone!

It's been ~2.5 years since we released Lucene 9.0 (December 2021) and I'd
like us to start working towards Lucene 10.0. I'm volunteering to be the
release manager and propose the following timeline:
- ~September 15th: main gets bumped to 11.x, branch_10x gets created
- ~September 22nd: Do a last 9.x minor release.
- ~October 1st: Release 10.0.

This may sound like a long notice period. My motivation is that there are a
few changes I have on my mind that are likely worthy of a major release,
and I plan on taking advantage of a date being set to stop procrastinating
and finally start moving these enhancements forward. These are not
blockers, only my wish list for Lucene 10.0; if they are not ready in time
we can have discussions about letting them slip until the next major.
- Greater I/O concurrency <https://github.com/apache/lucene/issues/13179>.
Can Lucene better utilize modern disks that are plenty concurrent?
- Decouple search concurrency from index geometry
<https://github.com/apache/lucene/issues/9721>. Can Lucene better utilize
modern CPUs that are plenty concurrent?
- "Sparse indexing" <https://github.com/apache/lucene/issues/11432> /
"zone indexing" for sorted indexes. This is one of the most efficient
techniques that OLAP databases take advantage of to make search fast. Let's
bring it to Lucene.
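
To illustrate the sparse/zone indexing idea from the last item, here is a minimal sketch in Python. It is not Lucene code and does not reflect how the linked issue proposes to implement the feature. The names `Zone`, `build_zones` and `range_query` are hypothetical, and the fixed block size is an assumption made for the example. The sketch only shows the general technique: when values are sorted, storing a (min, max) pair per block lets a range query skip whole blocks without reading them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    start: int      # position of the first value in the block
    end: int        # position one past the last value in the block
    min_value: int  # smallest value in the block
    max_value: int  # largest value in the block


def build_zones(sorted_values, block_size):
    """Record (min, max) for each fixed-size block of an already sorted list."""
    zones = []
    for start in range(0, len(sorted_values), block_size):
        end = min(start + block_size, len(sorted_values))
        # The input is sorted, so the first and last entries are the min and max.
        zones.append(Zone(start, end, sorted_values[start], sorted_values[end - 1]))
    return zones


def range_query(sorted_values, zones, low, high):
    """Return (positions with low <= value <= high, number of blocks scanned)."""
    hits, scanned = [], 0
    for zone in zones:
        if zone.max_value < low or zone.min_value > high:
            continue  # the whole block lies outside the range: skip it
        scanned += 1
        for pos in range(zone.start, zone.end):
            if low <= sorted_values[pos] <= high:
                hits.append(pos)
    return hits, scanned


if __name__ == "__main__":
    values = list(range(0, 1000, 2))   # 500 sorted values: 0, 2, ..., 998
    zones = build_zones(values, 50)    # 10 blocks of 50 values each
    hits, scanned = range_query(values, zones, 210, 230)
    print(len(hits), scanned)          # 11 1
```

In this toy example only one of the ten blocks is scanned, which is why the technique pays off mainly on sorted (or well-clustered) data: on unsorted data most blocks would overlap the query range and little could be skipped.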

This list isn't meant to be an exhaustive list of release highlights for
Lucene 10; feel free to add your own. There are also a number of cleanups
we may want to consider. I wanted to share this list for visibility though
in case you have thoughts on these enhancements and/or would like to help.

--
Adrien
Re: Lucene 10 [ In reply to ]
Timing makes sense to me. +1 for having a deadline to reduce
procrastination, but Adrien I don't honestly believe anyone who is
paying attention thinks that is what you have been doing!

On Wed, Mar 13, 2024 at 10:40 AM Adrien Grand <jpountz@gmail.com> wrote:
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Lucene 10 [ In reply to ]
Thanks Adrien, +1 to the timeline.

I'm also willing to work on/review the "Decouple search concurrency from
index geometry" <https://github.com/apache/lucene/issues/9721> task. It
would be very nice to have for latency-sensitive applications (rather than
having to tune the merge policy case by case). I cannot guarantee anything
yet, though, so if others are also working on it I'm happy to share
ideas/efforts (if any).

Patrick

On Thu, Mar 14, 2024 at 12:09 PM Michael Sokolov <msokolov@gmail.com> wrote:
> [...]
Re: Lucene 10 [ In reply to ]
Hey Patrick,
your help on search concurrency will be much appreciated :) I have a
very hacky branch that I'd like to use as a base for discussing the
issues I found and the adjustments needed. Lots to do there. I will ping
you once I put up a draft PR.

Cheers
Luca

On Fri, Mar 15, 2024 at 9:55 PM Patrick Zhai <zhai7631@gmail.com> wrote:
> [...]
Re: Lucene 10 [ In reply to ]
>
> [...] but Adrien I don't honestly believe anyone who is
> paying attention thinks that is what you have been doing!


+1. I wish I were procrastinating as productively!

D.
Re: Lucene 10 [ In reply to ]
Thanks Mike and Dawid for the kind words, and thanks Patrick, Luca and Egor
for your interest in decoupling index geometry from search concurrency.
This would be a great release highlight if we can get it into Lucene 10!

I haven't seen pushback on the proposed schedule so I plan on proceeding
with this timeline in mind.

If you have changes that you would like to include in Lucene 10.0, please
add the 10.0 milestone <https://github.com/apache/lucene/milestones/10.0.0>
to them. It's ok to be a bit ambitious at this stage and
optimistically mark some changes as scheduled for 10.0; we'll have
opportunities to remove items from this list when the date comes closer
if some issues are not getting proper traction. I'll take care of that.

On Mon, Mar 18, 2024 at 11:39 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:

>> [...] but Adrien I don't honestly believe anyone who is
>> paying attention thinks that is what you have been doing!
>
>
> +1. I wish I were procrastinating as productively!
>
> D.
>


--
Adrien