Mailing List Archive

How to filter KnnVectorQuery with multiple terms?
Hi

I am currently filtering a KnnVectorQuery as follows:

Query filter = new TermQuery(new Term(CLASSIFICATION_FIELD, classification));
query = new KnnVectorQuery(VECTOR_FIELD, queryVector, k, filter);

but it is not clear to me how I can filter on multiple terms.

Should I subclass MultiTermQuery and use it as the filter, just as I use TermQuery as the filter above?

Thanks

Michael
Re: How to filter KnnVectorQuery with multiple terms? [ In reply to ]
If I understand correctly, I believe you would want to use a TermInSetQuery.
An example usage can be found here:
https://github.com/zuliaio/zuliasearch/blob/main/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L398

You can also check out the usage of KnnVectorQuery here:
https://github.com/zuliaio/zuliasearch/blob/main/zulia-server/src/main/java/io/zulia/server/index/ZuliaIndex.java#L419
Note that in this case the getPreFilter method a few lines below uses a
BooleanQuery.Builder.

As noted in the TermInSetQuery documentation (
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TermInSetQuery.java#L62),
a filter on multiple terms could also be expressed as a BooleanQuery whose
term clauses all use Occur.SHOULD.

~Matt

(In reply to the message from Michael Wechner <michael.wechner@wyona.com> of Wed, Aug 31, 2022 at 11:15 AM.)
Re: How to filter KnnVectorQuery with multiple terms? [ In reply to ]
Hi Matt

Thanks very much for your feedback!

Based on your links, I will try:

Collection<BytesRef> terms = new ArrayList<>();
terms.add(new BytesRef(classification1));
terms.add(new BytesRef(classification2));
Query filter = new TermInSetQuery(CLASSIFICATION_FIELD, terms);

query = new KnnVectorQuery(VECTOR_FIELD, queryVector, k, filter);
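
To make the effect of such a multi-term filter concrete, here is a minimal plain-Java sketch of a pre-filtered nearest-neighbour search. It does not use Lucene at all: the Doc class, sampleDocs and search are hypothetical names made up for illustration, and the brute-force scan is only a stand-in for whatever Lucene does internally. The one assumption carried over from Lucene is that the filter is applied before the vector search, so every returned hit matches one of the allowed classifications.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of a pre-filtered kNN search; this is NOT Lucene code.
class PreFilteredKnnSketch {

    // Hypothetical in-memory document: an id, one classification term, one vector.
    static final class Doc {
        final String id;
        final String classification;
        final float[] vector;

        Doc(String id, String classification, float[] vector) {
            this.id = id;
            this.classification = classification;
            this.vector = vector;
        }
    }

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Keeps only documents whose classification is in `allowed`, then returns
    // the ids of the k nearest of those, nearest first.
    static List<String> search(List<Doc> docs, float[] query, int k, Set<String> allowed) {
        List<Doc> candidates = new ArrayList<>();
        for (Doc doc : docs) {
            if (allowed.contains(doc.classification)) {
                candidates.add(doc);
            }
        }
        candidates.sort(Comparator.comparingDouble((Doc d) -> squaredDistance(d.vector, query)));
        List<String> ids = new ArrayList<>();
        for (Doc doc : candidates.subList(0, Math.min(k, candidates.size()))) {
            ids.add(doc.id);
        }
        return ids;
    }

    // Made-up sample data for the demonstration.
    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("d1", "news", new float[] {0.0f, 0.0f}),
            new Doc("d2", "sports", new float[] {0.1f, 0.0f}),
            new Doc("d3", "news", new float[] {1.0f, 1.0f}),
            new Doc("d4", "finance", new float[] {0.2f, 0.0f}),
            new Doc("d5", "legal", new float[] {0.05f, 0.0f}));
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("sports", "finance"));
        // d1 and d5 are closest to the query but are excluded by the filter.
        System.out.println(search(sampleDocs(), new float[] {0.0f, 0.0f}, 2, allowed)); // prints [d2, d4]
    }
}
```

In this model the set of allowed classifications plays the role of the terms collection above, and fewer than k hits come back when fewer than k documents pass the filter.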

All the best

Michael



(In reply to the message from Matt Davis of 31.08.22 at 20:24.)
Re: How to filter KnnVectorQuery with multiple terms? [ In reply to ]
Simply put,

the last parameter of KnnVectorQuery is a Lucene Query, so you can pass
any query type there. TermInSetQuery is a good choice for an "IN
multiple terms" filter. But you can also pass a BooleanQuery with
multiple terms, a combination of other queries, a numeric range, or
a full-text query produced by one of Lucene's query parsers.
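
The way such filters combine can be modelled in plain Java with java.util.function.Predicate. The sketch below is not Lucene code and every name in it (Doc, the year field, classificationIn, classificationIs, yearBetween, matchingIds) is hypothetical. It only illustrates the logic: a "term in set" filter selects the same documents as a disjunction of single-term filters (the analogue of BooleanQuery clauses with Occur.SHOULD), and a further restriction such as a numeric range can be layered on top as a conjunction (the analogue of a required clause).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Plain-Java model of combining filters; this is NOT Lucene code.
class FilterCompositionSketch {

    // Hypothetical document with a classification term and a numeric year field.
    static final class Doc {
        final String id;
        final String classification;
        final int year;

        Doc(String id, String classification, int year) {
            this.id = id;
            this.classification = classification;
            this.year = year;
        }
    }

    // Model of a "term in set" filter: matches any of the given values.
    static Predicate<Doc> classificationIn(String... values) {
        Set<String> allowed = new HashSet<>(Arrays.asList(values));
        return doc -> allowed.contains(doc.classification);
    }

    // Model of a single-term filter.
    static Predicate<Doc> classificationIs(String value) {
        return doc -> value.equals(doc.classification);
    }

    // Model of an inclusive numeric range filter.
    static Predicate<Doc> yearBetween(int min, int max) {
        return doc -> doc.year >= min && doc.year <= max;
    }

    // Returns the ids of all documents accepted by the filter, in input order.
    static List<String> matchingIds(List<Doc> docs, Predicate<Doc> filter) {
        List<String> ids = new ArrayList<>();
        for (Doc doc : docs) {
            if (filter.test(doc)) {
                ids.add(doc.id);
            }
        }
        return ids;
    }

    // Made-up sample data for the demonstration.
    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("a", "news", 2019),
            new Doc("b", "sports", 2021),
            new Doc("c", "finance", 2022),
            new Doc("d", "news", 2022));
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        // Set membership and an OR of single terms select the same documents.
        System.out.println(matchingIds(docs, classificationIn("news", "sports")));                         // prints [a, b, d]
        System.out.println(matchingIds(docs, classificationIs("news").or(classificationIs("sports"))));    // prints [a, b, d]
        // Adding a range restriction narrows the result further.
        System.out.println(matchingIds(docs, classificationIn("news", "sports").and(yearBetween(2020, 2022)))); // prints [b, d]
    }
}
```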

Uwe

(In reply to the message from Michael Wechner of 31.08.2022 at 22:19.)
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: How to filter KnnVectorQuery with multiple terms? [ In reply to ]
Great, thank you very much for clarifying!

Michael

(In reply to the message from Uwe Schindler of 01.09.22 at 08:43.)

