Mailing List Archive

What is the status and what are the next steps re k-nn search?
Hi

If I understand correctly, the source version of Lucene contains an HNSW
implementation for k-nn search:

https://issues.apache.org/jira/browse/LUCENE-9004

and another algorithm based on coarse quantization is in development

https://issues.apache.org/jira/browse/LUCENE-9322

Also there are various efforts to benchmark the Lucene HNSW
implementation according to https://github.com/erikbern/ann-benchmarks

https://www.mail-archive.com/issues@lucene.apache.org/msg42711.html
https://github.com/jtibshirani/lucene/pull/1

and

https://github.com/alexklibisz/ann-benchmarks-lucene

But it is not clear to me what the current status of these developments
is and what next steps are planned.

Thanks

Michael



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: What is the status and what are the next steps re k-nn search?
Hi Michael,

I'll let others closer to the ongoing vectors developments answer with more
technical detail, but one high level answer:

The new ANN/kNN feature is not yet released! It will be in Lucene 9.0.0,
when that is released (maybe soonish -- there is a running thread about
remaining 9.0 blockers <https://markmail.org/thread/sga3dzbymnn5qdx4>).
This is a major new feature, requiring a new Codec component, and will not
be backported to Lucene 8.x.

Also, we already benchmark indexing/searching of vectors every night in
Lucene's nightly benchmarks <https://home.apache.org/~mikemccand/lucenebench/>
(hmm, they haven't run for the past few nights ... I'll try to fix).

Mike McCandless

http://blog.mikemccandless.com


Re: What is the status and what are the next steps re k-nn search?
Hi Michael

Thanks very much for your feedback!

I have added a FAQ

https://cwiki.apache.org/confluence/display/LUCENE/LuceneFAQ#LuceneFAQ-DoesLucenesupportANN/kNNsearch?

and will update it when more information is available.

Thanks

Michael

Re: What is the status and what are the next steps re k-nn search?
Awesome, thanks Michael!

Mike McCandless

http://blog.mikemccandless.com

