Mailing List Archive

KinoSearch highlight Field-specific terms
Hi one and all

Marvin, thanks for your help to date (brilliant work and ongoing support). I've stumbled across a rather interesting problem. I've indexed my data using KS 0.162. Everything works wonderfully except for one small (okay, not that small) problem.

When I pass the query string "+Health +language:(English) +region:(Africa International) +skip:repeat", KS returns strings found in region (e.g. Africa) as part of the bodytext category. I'd like to tell KS to search for "Health" only in the bodytext category, and to limit the search to documents where region is Africa|International, language is "English" and, of course, skip is repeat. I know that the field-specific searches are working nicely, as I've been playing around with the values in skip and language and these yield the correct results, but the problem remains that KS ignores the field-specific terms being passed.

The main problem is that the highlighter highlights the wrong words: the match is a valid find (in bodytext), but the term belongs to the wrong category (region).

The SQL statement would look something like:

select * from table
where bodytext like '%Health%'
and region in ('Africa', 'International')
and language = 'English'
and skip = 'repeat'

Is there any workaround or fix for this?

Code ...

$case = "yes";

@fields = qw(title bodytext);

@invIndx = (
"/tmp/KinoSearchIndex/user/=foobar/index/date/=20080601/case/=$case",
"/tmp/KinoSearchIndex/user/=foobar/index/date/=20080602/case/=$case",
"/tmp/KinoSearchIndex/user/=foobar/index/date/=20080603/case/=$case",
"/tmp/KinoSearchIndex/user/=foobar/index/date/=20080604/case/=$case",
"/tmp/KinoSearchIndex/user/=foobar/index/date/=20080605/case/=$case"
);

# Build the analyzer first: the Searchers below need it.
$tokenizer = KinoSearch::Analysis::Tokenizer->new;
$stemmer = KinoSearch::Analysis::Stemmer->new(language => 'en');
$analyzer = KinoSearch::Analysis::PolyAnalyzer->new(analyzers => [$tokenizer, $stemmer]);

# One Searcher per invindex.
foreach my $index (@invIndx)
{
push(@invIndxSearch, KinoSearch::Searcher->new( analyzer => $analyzer, invindex => $index));
}

$searcher = KinoSearch::Search::MultiSearcher->new(
searchables => \@invIndxSearch,
analyzer => $analyzer
);

$query_parser = KinoSearch::QueryParser::QueryParser->new(
analyzer => $analyzer,
fields => \@fields,
);

$query = $query_parser->parse($query_string);
$hits = $searcher->search( query => $query );
$highlighter = KinoSearch::Highlight::Highlighter->new(excerpt_field => 'bodytext');
$hits->create_excerpts(highlighter => $highlighter);

Cmd: ./multifind.pl "+Health +language:(English) +region:(Africa International) +skip:repeat"

Any assistance would be appreciated.

Regards,
Riyaad




Re: KinoSearch highlight Field-specific terms
On Aug 18, 2008, at 6:46 AM, Riyaad Miller wrote:

> When passing the query string "+Health +language:(English) +region:
> (Africa International) +skip:repeat", KS returns strings found in
> region (eg. Africa) as part of the bodytext category. I'd like to
> tell KS to search for only "Health" in the bodytext category and
> limit the search to where my region is Africa|International and
> where my language is "English" and of course skip is repeat. I know
> that the field-specific search are working nicely as I've been
> playing around with the values in skip and language and these yield
> the correct results but the problem remains the KS ignores the Field-
> specific terms being passed.


If I understand you correctly, the problem is that the excerpt has
some words highlighted that you don't want highlighted... ?

In the devel branch you can control this because you can supply a
different Query object to the Highlighter than the one that you
supplied to the Searcher. However, that feature won't be backported
to maint.
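
A minimal sketch of how a second, highlight-only query string might be derived from the one quoted above. It is plain Perl and does not call KinoSearch at all; strip_fielded_clauses is a hypothetical helper, not part of the library, and it only handles the simple field:term, field:"phrase" and field:(group) forms without nesting. The assumption is that the stripped string would then be parsed separately and the resulting Query given to the Highlighter, while the full string is still used for the search itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): drop every clause of the
# form field:term, field:"phrase" or field:(group), with an optional
# leading + or -, and return what is left of the query string.
sub strip_fielded_clauses {
    my ($query_string) = @_;
    my $stripped = $query_string;
    $stripped =~ s{
        [+-]?              # optional required or prohibited prefix
        \w+ :              # field name followed by a colon
        (?: \( [^)]* \)    # parenthesised group, no nesting
          | " [^"]* "      # quoted phrase
          | \S+            # single bare term
        )
    }{}gx;
    $stripped =~ s/^\s+|\s+$//g;    # trim both ends
    $stripped =~ s/\s+/ /g;         # collapse runs of whitespace
    return $stripped;
}

my $query_string
    = '+Health +language:(English) +region:(Africa International) +skip:repeat';
my $highlight_string = strip_fielded_clauses($query_string);

print "$highlight_string\n";    # prints "+Health"
```

Only the unfielded part of the query survives, so terms such as "Africa" would no longer be candidates for highlighting in bodytext. How the second Query is actually passed to the Highlighter depends on the devel branch API and is deliberately not shown here.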

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


_______________________________________________
KinoSearch mailing list
KinoSearch@rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch