Mailing List Archive: Multimodal search

Multimodal search

michael.wechner at wyona

Oct 12, 2023, 12:59 PM

Post #1 of 6

Hi

Has any of the Lucene committers considered making Lucene multimodal?

A quick Google search turned up, for example:

https://dl.acm.org/doi/abs/10.1145/3503161.3548768

https://sigir-ecom.github.io/ecom2018/ecom18Papers/paper7.pdf

Thanks

Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Multimodal search

msfroh at gmail

Oct 12, 2023, 3:49 PM

Post #2 of 6

We recently added multimodal search in OpenSearch:
https://github.com/opensearch-project/neural-search/pull/359

Since Lucene ultimately just cares about embeddings, does Lucene itself
really need to be multimodal? Wherever the embeddings come from, Lucene can
index the vectors and combine them with textual queries, right?
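
As an illustration of this point (not code from the thread, and not the
Lucene API), here is a minimal Python sketch of a toy in-memory index. It
assumes the embeddings were produced elsewhere, by any model and from any
modality, and that they share one vector space. The documents, vectors, and
function names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical documents. The vectors are made up; in practice they would
# come from an image, text, or audio encoder outside the index.
DOCS = [
    {"id": "photo-1", "text": "red running shoes", "vec": [0.9, 0.1, 0.0]},
    {"id": "photo-2", "text": "blue running shoes", "vec": [0.1, 0.9, 0.0]},
    {"id": "manual-1", "text": "red wine guide", "vec": [0.8, 0.0, 0.6]},
]


def search(query_vec, required_term=None, top_k=2):
    """Rank documents by cosine similarity to query_vec.

    If required_term is given, only documents whose text contains that
    exact term are considered (a stand-in for a textual query clause).
    """
    hits = []
    for doc in DOCS:
        if required_term is not None and required_term not in doc["text"].split():
            continue
        hits.append((doc["id"], cosine(query_vec, doc["vec"])))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    # The index never needs to know which modality produced the query vector.
    print(search([1.0, 0.0, 0.0]))
    print(search([1.0, 0.0, 0.0], required_term="shoes"))
```

The index only sees vectors and terms; which modality produced a vector is
decided entirely outside of it.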

Thanks,
Froh


Re: Multimodal search

michael.wechner at wyona

Oct 13, 2023, 8:51 AM

Post #3 of 6

Thanks for your feedback and the link to the OpenSearch implementation!

I think the embedding approach as it exists today is not and will not be
able to provide good enough accuracy.
Many people try to fix this with re-ranking, which helps, but does not
really fix the actual problem.

I think we focus too much on text, because text/language is actually
just a representation of the "models" we create in our minds from the
reality we perceive via our senses.

When you take multimodality into account from the very beginning, you
are forced to approach search differently. I would argue that this leads
to a much more powerful search implementation, one that provides better
accuracy and is also much better at knowing what it does not know.

I do not mean to sound philosophical. I actually have a quite clear
implementation in mind, and on paper, but I would be interested to know
whether the Lucene community is interested in reconsidering search
from the ground up.

I think the Lucene community has fantastic knowledge and expertise, but
I think it is time to evolve quite radically, and not just do another
vector search implementation.

WDYT?

Thanks

Michael


Re: Multimodal search

vermanavneet003 at gmail

Oct 14, 2023, 12:38 AM

Post #4 of 6

Hi Michael,
Please correct me if I am wrong, but I think what you mean by multimodal
search is combining text search and vector search to improve the accuracy
of search results. As far as I understand the search space, people call
this hybrid search. We recently launched a query clause in OpenSearch
called "hybrid", which takes this approach and combines the scores of text
and vector search globally (https://opensearch.org/blog/hybrid-search/).
In our experiments, accuracy was better than with text search or vector
search alone. I am just curious whether you are thinking of something like
this or have a completely different idea.
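
To make the idea of combining scores concrete, here is a minimal Python
sketch. It is an illustration only and not the OpenSearch implementation:
it assumes min-max normalization of each result list followed by a weighted
arithmetic mean, and the function names, weights, and scores are all
hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so every document gets the top score.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid(text_scores, vector_scores, text_weight=0.5):
    """Combine two score maps after normalizing each one separately.

    A document missing from one result list contributes 0.0 for that list.
    Returns (doc_id, combined_score) pairs, best first.
    """
    text_n = min_max(text_scores)
    vec_n = min_max(vector_scores)
    combined = {}
    for doc in set(text_n) | set(vec_n):
        combined[doc] = (text_weight * text_n.get(doc, 0.0)
                         + (1.0 - text_weight) * vec_n.get(doc, 0.0))
    # Sort by score descending, then by id so that ties are deterministic.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    text_scores = {"a": 12.0, "b": 7.0, "c": 2.0}       # e.g. BM25-like scores
    vector_scores = {"b": 0.95, "c": 0.90, "d": 0.55}   # e.g. cosine similarities
    for doc, score in hybrid(text_scores, vector_scores):
        print(doc, round(score, 4))
```

Normalizing first matters because raw text scores and vector similarities
live on different scales, so adding them directly would let one side
dominate.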

I agree that re-ranking and similar techniques are currently very popular
for improving the accuracy of search results.

Thanks
Navneet


Re: Multimodal search

michael.wechner at wyona

Oct 15, 2023, 12:05 PM

Post #5 of 6

Hi Navneet

I also observe that various "vector search DBs" are implementing hybrid
search, because the accuracy with embeddings is often not good enough.
Vectors are often too "mushy" and hybrid search can help to improve
accuracy, just as re-ranking does, but I think there is a better way.

Depending on the dataset and the expertise of the human, answers by
"humans" are much more accurate, because, I think, "humans" extract
"features" from the input and then operate on these "features". See for
example

https://medium.com/aleph-alpha-blog/multimodality-attention-is-all-you-need-is-all-we-needed-526c45abdf0

and see the principles behind DALL-E and CLIP.

I think the same or similar principles could be re-used to implement a
more accurate search.

I have built a very simple PoC, and it looks promising: this approach
appears to provide much higher accuracy, because the similarity scores
are much more distinct.

Of course there are various challenges, but I think it is worth exploring.

I also understand that within an existing "ecosystem", change, or
trying something new, can be difficult, but I guess I am not the only one
seeing low accuracy as a fundamental problem, right?

Thanks

Michael


Re: Multimodal search

michael.wechner at wyona

Oct 16, 2023, 1:46 AM

Post #6 of 6

By the way, here are some other examples of hybrid search implementations
that use RRF (reciprocal rank fusion):

https://weaviate.io/blog/hybrid-search-explained
https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking
https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html

But as written below, I don't think this really addresses the problem of
accuracy at its core.
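
For readers unfamiliar with RRF, here is a minimal Python sketch of
reciprocal rank fusion. It is an illustration under stated assumptions and
not code from any of the linked implementations: each document scores the
sum of 1 / (k + rank) over the result lists it appears in, with 1-based
ranks and the commonly used constant k = 60. The function name and document
ids are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one ranking.

    rankings: iterable of lists, each ordered best first.
    k: smoothing constant (60 is a common choice, assumed here).
    Returns (doc_id, fused_score) pairs, best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Sort by score descending, then by id so that ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    text_ranking = ["a", "b", "c"]      # e.g. from a keyword query
    vector_ranking = ["b", "c", "d"]    # e.g. from a nearest-neighbour query
    for doc, score in reciprocal_rank_fusion([text_ranking, vector_ranking]):
        print(doc, round(score, 5))
```

RRF uses only rank positions, so it needs no score normalization, but it
also discards how far apart the underlying similarity scores were.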

Thanks

Michael

On 15.10.23 at 21:05, Michael Wechner wrote:
> Hi Navneet
>
> I also observe that various "vector search DBs" are implementing
> hybrid search, because the accuracy with embeddings is often not good
> enough.
> Vectors are often too "mushy" and hybrid search can help to improve
> accuracy, just as re-ranking does, but I think there is a better way.
>
> Depending on the dataset and the expertise of the human, answers by
> "humans" are much more accurate, because, I think, "humans" extract
> "features" from the input and then operate on these "features".
> See for example
>
> https://medium.com/aleph-alpha-blog/multimodality-attention-is-all-you-need-is-all-we-needed-526c45abdf0
>
> and see the principles behind DALL-E and CLIP.
>
> I think the same or similar principles could be re-used to implement a
> more accurate search.
>
> I have built a very simple PoC, and it looks promising: this approach
> appears to provide much higher accuracy, because the similarity scores
> are much more distinct.
>
> Of course there are various challenges, but I think it is worth exploring.
>
> I also understand that within an existing "ecosystem", change, or
> trying something new, can be difficult, but I guess I am not the only
> one seeing low accuracy as a fundamental problem, right?
>
> Thanks
>
> Michael