I've got a searching problem which I know lots of other people have run
across too. We've got documents which have keywords (which we extract and
put into a 'keywords' field) and also have body text (which we put in a
'body' field).
Let's say we search for "text retrieval". We want to find documents that
have "text retrieval" in the body OR in the keywords, but we want to weight
hits on the keywords more heavily. I can't boost the tokens in the index
itself, so I have to do that through the query.
If I convert a query for phrase Q into this:
body:Q OR keywords:Q^n
does that do what I want?
How should I select the boost factor n? Are there negative consequences to
this strategy? Am I better off doing two queries and merging the results
myself?
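To make the rewrite concrete, here is a minimal sketch of the string
transformation I have in mind. The class and method names are made up for
illustration, and the boost of 4 is an arbitrary placeholder, not a
recommendation. It assumes the standard query syntax, where a quoted phrase
can carry a ^boost suffix that applies only to the clause it follows.

```java
public class BoostedQueryBuilder {

    // Hypothetical helper: wraps the phrase in double quotes for the query
    // syntax, escaping any backslashes and double quotes inside it.
    public static String quote(String phrase) {
        return "\"" + phrase.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds: body:"<phrase>" OR keywords:"<phrase>"^<boost>
    // The field names 'body' and 'keywords' are the ones described above;
    // the boost is attached to the keywords clause only.
    public static String build(String phrase, float boost) {
        String q = quote(phrase);
        return "body:" + q + " OR keywords:" + q + "^" + boost;
    }

    public static void main(String[] args) {
        // Arbitrary example boost; choosing a good value is the open question.
        System.out.println(build("text retrieval", 4.0f));
    }
}
```

The resulting string would then be handed to the query parser as usual.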
--
Brian Goetz
Quiotix Corporation
brian@quiotix.com Tel: 650-843-1300 Fax: 650-324-8032
http://www.quiotix.com
--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>