Mailing List Archive

Vector Search on Lucene
Hi all,

I would like to use vector search with Lucene.

I have vectors created for queries and documents outside Lucene.
I would like to upload the document vectors to a Lucene index, then use
Lucene to filter the documents (as in classical search) and rank the
remaining documents with the vectors.

For performance reasons, I would like a fast KNN search for the ranking step.

I searched on Google but did not find any documentation with code samples.

Two questions:
* Is this a correct design pattern?
* Is there a good article explaining how to do this with Lucene?

Best Regards
Marcos Rebelo

--

*Marcos Bruno Gomes Rebelo Engineering Manager / Data Scientist / Software
Engineer*
Linkedin: https://www.linkedin.com/in/oleber/
*Adding value to your data. Specialized in Search and Recommendation
Systems*
Technologies: Elastic, Spark, Scala, Jupyter Notebook, Python, ...
Re: Vector Search on Lucene
Hi Marcos

Indexing looks roughly like this:

// getEmbedding() stands for whatever produces your vectors outside Lucene
Document doc = new Document();
float[] vector = getEmbedding(text);
FieldType vectorFieldType = KnnVectorField.createFieldType(vector.length, VectorSimilarityFunction.COSINE);
KnnVectorField vectorField = new KnnVectorField("my_vector_field", vector, vectorFieldType);
doc.add(vectorField);
writer.addDocument(doc); // writer is an open IndexWriter
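
A side note on the COSINE choice, offered as a sketch rather than a recommendation from this thread: as far as I recall, the Javadoc for VectorSimilarityFunction suggests normalizing vectors to unit length and using DOT_PRODUCT instead when cosine similarity is wanted. Since the vectors are created outside Lucene anyway, that step could look like the plain-Java helper below. The class and method names are made up for illustration and are not part of Lucene.

```java
// Hypothetical helper, not part of Lucene: scales a vector to unit (L2) length.
class VectorNormSketch {

    static float[] l2Normalize(float[] v) {
        // Accumulate the squared length in double precision.
        double sumSquares = 0.0;
        for (float x : v) {
            sumSquares += (double) x * x;
        }
        double norm = Math.sqrt(sumSquares);
        if (norm == 0.0) {
            // A zero vector has no direction, so it cannot be normalized.
            throw new IllegalArgumentException("cannot normalize a zero vector");
        }
        // Return a new array; the input is left untouched.
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = (float) (v[i] / norm);
        }
        return out;
    }
}
```

The same normalization would have to be applied to the query vector before searching.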


Searching / retrieval looks roughly like this:

float[] queryVector = getEmbedding(question);
int k = 7; // the number of nearest documents to return
Query query = new KnnVectorQuery("my_vector_field", queryVector, k);
IndexSearcher searcher = new IndexSearcher(indexReader);
TopDocs topDocs = searcher.search(query, k);
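
Regarding the filter-then-rank pattern from the original question: conceptually it means restricting the candidates first and then keeping the k candidates closest to the query vector. The plain-Java sketch below shows that idea as an exact (brute-force) search, with no Lucene involved; Doc, its category field, and search() are hypothetical names for illustration only. Lucene's KNN search is approximate (HNSW graph) rather than brute force, and as far as I know KnnVectorQuery also has a constructor that takes a filter Query as a fourth argument, which would be the place to plug in the classical query.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration, not Lucene code: exact KNN over a filtered candidate set.
class FilteredKnnSketch {

    // A document with one filterable attribute and a precomputed embedding.
    static class Doc {
        final String id;
        final String category;
        final float[] vector;

        Doc(String id, String category, float[] vector) {
            this.id = id;
            this.category = category;
            this.vector = vector;
        }
    }

    // Plain cosine similarity; assumes equal lengths and non-zero vectors.
    static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Step 1: keep only documents matching the filter.
    // Step 2: sort the survivors by similarity to the query, best first.
    // Step 3: return the ids of at most k of them.
    static List<String> search(List<Doc> docs, Predicate<Doc> filter, float[] query, int k) {
        List<Doc> candidates = new ArrayList<>();
        for (Doc d : docs) {
            if (filter.test(d)) {
                candidates.add(d);
            }
        }
        candidates.sort(Comparator.comparingDouble((Doc d) -> cosine(d.vector, query)).reversed());
        List<String> ids = new ArrayList<>();
        for (Doc d : candidates.subList(0, Math.min(k, candidates.size()))) {
            ids.add(d.id);
        }
        return ids;
    }
}
```

For example, search(docs, d -> d.category.equals("shoes"), queryVector, 2) would return the ids of the two "shoes" documents closest to queryVector.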

Also see

https://lucene.apache.org/core/9_5_0/demo/index.html#Embeddings
https://lucene.apache.org/core/9_5_0/demo/org/apache/lucene/demo/knn/package-summary.html

HTH

Michael

On 02.03.23 at 10:25, marcos rebelo wrote:
> [...]
Re: Vector Search on Lucene
Note that Lucene's demo package (IndexFiles.java, SearchFiles.java) also
shows examples of how to index and search KNN vectors.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Mar 2, 2023 at 4:46 AM Michael Wechner <michael.wechner@wyona.com>
wrote:

> [...]