Mailing List Archive

SPLADE implementation
Hi

I have found the following issue re a possible SPLADE implementation

https://github.com/apache/lucene/issues/11799

Is somebody still working on this?

Thanks

Michael



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: SPLADE implementation
Hi Michael,

What functionality are you missing? Lucene already supports
indexing/querying weighted terms using FeatureField.

On Wed, Nov 15, 2023 at 10:03 AM Michael Wechner <michael.wechner@wyona.com>
wrote:

> Hi
>
> I have found the following issue re a possible SPLADE implementation
>
> https://github.com/apache/lucene/issues/11799
>
> Is somebody still working on this?
>
> Thanks
>
> Michael

--
Adrien
Re: SPLADE implementation
Hi Adrien

Ah ok, I did not realize this, thanks for pointing this out!

I don't quite understand though, how you would implement the "SPLADE"
approach using FeatureField from the documentation at

https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html

For example when indexing a document or doing a query and I use some
language model (e.g. BERT) to do the term expansion, how
do I then make use of FeatureField exactly?

I tried to find some code examples but couldn't. Do you maybe have some
pointers?

Thanks

Michael


On 15.11.23 at 10:34, Adrien Grand wrote:
> Hi Michael,
>
> What functionality are you missing? Lucene already supports
> indexing/querying weighted terms using FeatureField.
Re: SPLADE implementation
Say your model produces a set of weighted terms:
- At index time, for each (term, weight) pair, you add a "new
FeatureField(fieldName, term, weight)" field to your document.
- At search time, for each (term, weight) pair, you add a "new
BooleanClause(FeatureField.newLinearQuery(fieldName, term, weight),
BooleanClause.Occur.SHOULD)" to your BooleanQuery.
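
Conceptually, the two steps above amount to a sparse dot product between the query's weighted terms and the document's weighted terms. The following is a minimal plain-Java sketch of that idea, not Lucene code. It assumes that each FeatureField.newLinearQuery clause contributes its weight times the stored feature value, and that a BooleanQuery of SHOULD clauses adds up the contributions of the clauses that match. The class name, method name, terms, and weights are hypothetical and chosen only for illustration.

```java
import java.util.Map;

/**
 * Plain-Java model (no Lucene dependency) of the score that results from
 * indexing (term, weight) pairs and querying them with linear feature clauses.
 * Assumption: a matching clause contributes queryWeight * documentWeight,
 * and the contributions of all matching clauses are summed.
 */
class SparseDotProductSketch {

    /** Sums queryWeight * documentWeight over the terms present in both maps. */
    public static float score(Map<String, Float> queryWeights, Map<String, Float> docWeights) {
        float sum = 0f;
        for (Map.Entry<String, Float> entry : queryWeights.entrySet()) {
            Float docWeight = docWeights.get(entry.getKey());
            if (docWeight != null) { // a clause only matches if the document has the feature
                sum += entry.getValue() * docWeight;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical model output for one document and one query.
        Map<String, Float> doc = Map.of("rainforest", 2.0f, "jungle", 1.5f, "brazil", 1.0f);
        Map<String, Float> query = Map.of("jungle", 0.5f, "animals", 1.0f);
        // Only "jungle" occurs on both sides: 0.5 * 1.5
        System.out.println(score(query, doc)); // prints 0.75
    }
}
```

Real Lucene scores will differ slightly from this model, because FeatureField stores feature values with reduced precision.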

On Wed, Nov 15, 2023 at 11:08 AM Michael Wechner <michael.wechner@wyona.com>
wrote:

> Hi Adrien
>
> Ah ok, I did not realize this, thanks for pointing this out!
>
> I don't quite understand though, how you would implement the "SPLADE"
> approach using FeatureField from the documentation at
>
>
> https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html
>
> For example when indexing a document or doing a query and I use some
> language model (e.g. BERT) to do the term expansion, how
> do I then make use of FeatureField exactly?
>
> I tried to find some code examples but couldn't. Do you maybe have some
> pointers?
>
> Thanks
>
> Michael

--
Adrien
Re: SPLADE implementation
thank you very much, will try this :-)


On 15.11.23 at 11:25, Adrien Grand wrote:
> Say your model produces a set of weighted terms:
> - At index time, for each (term, weight) pair, you add a "new
> FeatureField(fieldName, term, weight)" field to your document.
> - At search time, for each (term, weight) pair, you add a "new
> BooleanClause(FeatureField.newLinearQuery(fieldName, term, weight),
> BooleanClause.Occur.SHOULD)" to your BooleanQuery.
Re: SPLADE implementation
I got it running now :-) Thanks again! See the code below, which might
help others as well.

I don't quite understand the relationship between weights, scores, etc.
yet, but will try to figure it out from the documentation at

https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html

Thanks

Michael

// Excerpt from a method that returns a Query; "parser", "getFeatures" and "log"
// are defined elsewhere in the surrounding class.
String question = "What animals live in the rainforests of Brazil?";

Query questionQuery = parser.parse(question);

// Expansion terms for the question, for example "jungle" as an alternative to "rainforests"
List<String> features = getFeatures(question);
if (!features.isEmpty()) {
    BooleanQuery.Builder bqb = new BooleanQuery.Builder();
    bqb.add(questionQuery, BooleanClause.Occur.SHOULD);
    for (String feature : features) {
        // TODO: Replace hard-coded weight
        bqb.add(new BooleanClause(
                FeatureField.newLinearQuery("feature_field_name", feature, 0.3F),
                BooleanClause.Occur.SHOULD));
    }
    BooleanQuery termExpansionQuery = bqb.build();
    log.info("Term expansion query: " + termExpansionQuery);
    return termExpansionQuery;
} else {
    log.info("Regular query: " + questionQuery);
    return questionQuery;
}
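
As a rough guide to how the weights in the code above relate to the final score, the combined query can be read as the formula below. This is a sketch under two assumptions: a BooleanQuery adds up the scores of its matching SHOULD clauses, and each newLinearQuery clause scores a document as its weight times the stored feature value. The symbol v_d(f), the value indexed for feature f in document d, is introduced here only for illustration.

```latex
\mathrm{score}(d) \;=\; \mathrm{score}_{\text{questionQuery}}(d)
  \;+\; \sum_{f \,\in\, \text{features},\; f \text{ indexed for } d} 0.3 \cdot v_d(f)
```

For example, if a document was indexed with the feature "jungle" at value 2.0, that clause adds about 0.3 * 2.0 = 0.6 to whatever the parsed question query scores for the same document. Replacing the hard-coded 0.3 with the per-term weight from the model would make the sum a weighted dot product between query and document terms.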



On 15.11.23 at 11:35, Michael Wechner wrote:
> thank you very much, will try this :-)