Mailing List Archive

lucene ranking question
My query is a document, and my index also contains that document in a
field. There may be some formatting differences between the two
representations, but the two versions of the document should be largely
similar. The current scoring function is not giving me good results,
since I am not doing a keyword-to-document match. What would be a good
way to match a document to a document using Lucene?

Appreciate the help.

Thanks!
Akanksha
Re: lucene ranking question
Hi Akanksha,

You would probably get more responses to a question like this if you
posted it to the user list.

To your question, I don't see why the current results won't be good enough.
I am not sure, though, what exactly you mean by saying "index also contains
the document in a field"... If the following are all true:
- a document added to your index has a field named "textfield"
- that field was added to the document before the document was added to
the index
- that field is indexed with Field.Index.TOKENIZED
- the text of that field is the same as that of the query
- the field name and analyzer passed to the query parser are the same as
those used at indexing
then you should get the expected results.
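
To illustrate why the last condition matters, here is a minimal sketch in
plain Java. It does not use Lucene's API: analyze(), splitOnWhitespace()
and overlap() are hypothetical stand-ins for an analyzer and a crude
match measure, and the two sample strings are made up. It only shows that
formatting differences disappear when both sides go through the same
analysis, and that they hurt matching when the two sides are analyzed
differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalyzerSketch {

    // Hypothetical stand-in for an analyzer (not a Lucene class):
    // lowercases the text and splits on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // A deliberately different "analyzer": splits on whitespace only,
    // keeping case and punctuation.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Fraction of distinct query tokens that also occur in the indexed tokens.
    // This is a crude illustration, not Lucene's scoring formula.
    static double overlap(List<String> queryTokens, List<String> indexedTokens) {
        Set<String> query = new HashSet<>(queryTokens);
        Set<String> indexed = new HashSet<>(indexedTokens);
        if (query.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (String t : query) {
            if (indexed.contains(t)) {
                hits++;
            }
        }
        return (double) hits / query.size();
    }

    public static void main(String[] args) {
        // Made-up sample texts: same content, different formatting.
        String indexedText = "Lucene ranks documents by term overlap.";
        String queryText = "LUCENE   ranks documents, by term-overlap";

        List<String> indexedTokens = analyze(indexedText);

        // Same analysis on both sides: every query token is found.
        System.out.println(overlap(analyze(queryText), indexedTokens));           // prints 1.0

        // Different analysis at query time: most tokens no longer match.
        System.out.println(overlap(splitOnWhitespace(queryText), indexedTokens)); // prints 0.4
    }
}
```

In Lucene itself the same idea applies to the analyzer given to the index
writer and the one given to the query parser.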

The Lucene FAQ may help you here - in particular the entry "Why am I
getting no hits / incorrect hits?".

Regards,
Doron

Akanksha Baid <baid@cs.wisc.edu> wrote on 25/06/2007 12:19:41:

> My query is a document and my index also contains the document in a
> field. There may be some formatting differences between the two
> representations, but largely both versions of the document should be
> fairly similar. Using the current scoring function is not giving me good
> results since I am not doing a keyword to document match. What would be
> a good way to match a document to a document using Lucene?
>
> Appreciate the help.
>
> Thanks!
> Akanksha