Hi,
Shorter documents always seem to score higher in Lucene. I was under the impression that the normalization factors in Lucene's scoring function would compensate for this, but after a couple of experiments the short documents still always score the highest.
Does anyone have any ideas on how to make longer documents score higher?
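As an illustration of the effect described above, here is a minimal sketch. It assumes the length norm is 1/sqrt(number of terms in the field) and the term-frequency factor is sqrt(tf), and it ignores idf and every other factor. The class and method names are hypothetical, and this is not Lucene's actual code.

```java
public class LengthNormSketch {

    // Assumed length normalization: 1 / sqrt(number of terms in the field).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified, hypothetical score: sqrt(term frequency) * length norm.
    // idf, query norm and boosts are deliberately left out.
    static double score(int termFreq, int numTerms) {
        return Math.sqrt(termFreq) * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Short document: 25 terms, query term occurs once.
        System.out.println("short: " + score(1, 25));
        // Long document: 400 terms, query term occurs four times.
        System.out.println("long:  " + score(4, 400));
    }
}
```

Under these assumptions the short document still wins even though the long one contains the term four times, because the norm shrinks faster with length than the term-frequency factor grows.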
Also, I would like a way to boost documents according to the number of in-links each document has.
Has anyone implemented a type of Document.setBoost() method?
I found a thread on the lucene-dev mailing list where Doug Cutting mentions that this would be a great feature to add to Lucene. No one followed up on his email.
Melissa.