Mailing List Archive: Limiting a document's score?

Jan 16, 2008, 5:26 PM

Post #1 of 4 (1723 views)

I've been working on fine-tuning our Lucene implementation to provide for
better search results.

I'm looking for a way to limit the score of a fuzzy or sloppy search so that
it never returns a perfect score. Is this possible? I sometimes manually
combine multiple searches into one result set but then I get a perfect score
from a fuzzy search next to a perfect score from a basic search.

Is there a way to limit the score or a better way to combine multiple
searches so that fuzzy matches don't give me a perfect score? By limit the
score, I don't mean I want to cut off a certain set of the results, but I'd
rather take the top 50% of the scores and turn a 100% relevance score into a
50% relevance or something.
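
One way to read "turn a 100% relevance score into a 50% relevance" is a plain rescale applied to the fuzzy hits before the two result sets are merged by hand. Below is a minimal sketch of that arithmetic, assuming both searches report scores normalized to the 0..1 range. ScoredHit, ManualMerge and the ceiling parameter are made-up names for illustration, not part of Lucene or Lucene.Net.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for one search result; not a Lucene class.
class ScoredHit {
    final String docId;
    final float score;

    ScoredHit(String docId, float score) {
        this.docId = docId;
        this.score = score;
    }
}

// Hypothetical helper that merges two hand-run searches into one list.
class ManualMerge {
    // Multiplies every fuzzy score by ceiling (for example 0.5f), so a fuzzy
    // score of 1.0 becomes 0.5, then sorts all hits with the highest score first.
    static List<ScoredHit> merge(List<ScoredHit> exact, List<ScoredHit> fuzzy, float ceiling) {
        List<ScoredHit> merged = new ArrayList<ScoredHit>(exact);
        for (ScoredHit hit : fuzzy) {
            merged.add(new ScoredHit(hit.docId, hit.score * ceiling));
        }
        Collections.sort(merged, new Comparator<ScoredHit>() {
            public int compare(ScoredHit a, ScoredHit b) {
                return Float.compare(b.score, a.score);
            }
        });
        return merged;
    }
}
```

This sketch does not remove documents that appear in both result sets; a real merge would probably keep only the higher-scoring entry per document.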
--
View this message in context: http://www.nabble.com/Limiting-a-document%27s-score--tp14903523p14903523.html
Sent from the Lucene - General mailing list archive at Nabble.com.

Re: Limiting a document's score?

hossman_lucene at fucit

Jan 22, 2008, 5:33 PM

Post #2 of 4 (1627 views)

: I've been working on fine-tuning our Lucene implementation to provide for
: better search results.
:
: I'm looking for a way to limit the score of a fuzzy or sloppy search so that
: it never returns a perfect score. Is this possible? I sometimes manually

This sounds like a question more applicable to the java-user mailing list,
assuming you are using the Lucene Java API in your application.
If you are asking about doing fuzzy queries when using Nutch or Solr, the
basic info is still the same: asking your question on the appropriate user
list will reach a larger community of people with more specific info about
the context of your question.

Assuming you are using the Java API, then the Similarity.sloppyFreq method
is probably where you want to start looking.
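
For orientation, the stock implementation of that hook in DefaultSimilarity returns 1 / (distance + 1), so a phrase match with no slop used (distance 0) gets the full 1.0 and sloppier matches get less. The sketch below reproduces that formula in plain Java next to a steeper variant. The class and method names are illustrative only, and the quadratic falloff is an assumption, not something Lucene ships. In a real application the variant would live in a Similarity subclass that overrides sloppyFreq rather than in a standalone helper.

```java
// Standalone illustration of the sloppyFreq arithmetic; not Lucene code.
class SloppyFreqSketch {
    // Same formula as DefaultSimilarity.sloppyFreq(int distance).
    static float defaultSloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Assumed alternative: penalize distance quadratically so that
    // sloppy matches fall off faster than with the default formula.
    static float steeperSloppyFreq(int distance) {
        return 1.0f / ((distance + 1) * (distance + 1));
    }
}
```

Both versions still return 1.0 at distance 0, so a change like this lowers sloppy matches relative to exact-position matches rather than capping every match.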

-Hoss

Re: Limiting a document's score?

mason at jambase

Jan 22, 2008, 6:25 PM

Post #3 of 4 (1622 views)

I'm actually not using Java; I'm using the .NET port of Lucene, Lucene.Net.
However, there are no forums and barely any support for the .NET port of
Lucene. The framework and functionality are the same: the same properties,
methods and everything.
I'll look into Similarity.sloppyFreq, but at first glance I'm not sure
this is what I'm looking for.

Maybe there is a better way for me to merge the two searches without getting
100% fuzzy matches sitting next to 100% basic search matches? This is
essentially what I want to do. I want to show fuzzy matches, but the fuzzy
match scores should be normalized against the basic search scores so that
all results (no matter which search they came from) end up in a single
relevancy order.

Hope this helps provide some insight into my issues.

Re: Limiting a document's score?

hossman_lucene at fucit

Jan 22, 2008, 7:17 PM

Post #4 of 4 (1636 views)

: I'm actually not using Java, I'm using the .NET port of Lucene, Lucene.Net.
: However, there are no forums and barely any support for the .NET port of

There are no official "forums" for Lucene Java either, but if you are
referring to mailing lists then your statement is patently false: there is
in fact a lucene-net-user@incubator.apache.org mailing list. As for
support ... support for all Apache projects comes from the community. If
no one using Lucene.Net participates in the Lucene.Net community by
posting their questions/answers on the Lucene.Net mailing list because
there is "barely any support", then you're right -- there never will be,
as long as people have that attitude.

: I'll look into the Similarity.sloppyFreq, but at first glance I'm not sure
: this is what I'm looking for.

It's used for both FuzzyQueries and phrase queries ... you may need to
subclass FuzzyQuery and override the getSimilarity method if you only want
to change the scoring for one of them.

: Maybe there is a better way for me to merge the two searches without getting
: 100% fuzzy matches sitting next to 100% basic search matches? This is

There is no way to merge scores from separate queries ... perhaps what you
really want to do is combine your FuzzyQuery with your exact queries
(using a BooleanQuery) and execute one search ... then your scores will
all be relative to one another.
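
To see why a single combined query helps: a BooleanQuery scores a document by adding up the scores of the clauses it matches, so a boost below 1 on the fuzzy clause keeps a fuzzy-only match under an exact match. The toy model below shows just that arithmetic. It ignores the coord factor and query normalization, and the 0.5 boost and all names are assumptions for illustration. The leading comment shows roughly what the corresponding Lucene 2.x calls look like.

```java
// Toy model of summed, boosted clause scores; not Lucene code.
// Roughly corresponds to these Lucene 2.x calls (not exercised here):
//   BooleanQuery combined = new BooleanQuery();
//   combined.add(exactQuery, BooleanClause.Occur.SHOULD);
//   fuzzyQuery.setBoost(0.5f);
//   combined.add(fuzzyQuery, BooleanClause.Occur.SHOULD);
class CombinedScoreSketch {
    static final float EXACT_BOOST = 1.0f;
    static final float FUZZY_BOOST = 0.5f; // assumed value, tune to taste

    // A raw score of 0 means the clause did not match the document.
    static float score(float exactRaw, float fuzzyRaw) {
        return EXACT_BOOST * exactRaw + FUZZY_BOOST * fuzzyRaw;
    }
}
```

In this model a document that matches only the fuzzy clause with a raw score of 1.0 ends up at 0.5, while a document that matches the exact clause scores at least 1.0 and usually more, since an exact term also satisfies the fuzzy clause.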

-Hoss