Mailing List Archive: Highlighting query results, my method is too crude, but how to improve it?

Highlighting query results, my method is too crude, but how to improve it?

Feb 20, 2023, 6:07 AM

Post #1 of 5 (304 views)

Sorry, I apologize for this being a bit long and for explaining the problem
at the very bottom after all the background, rather than starting with it at
the top. I thought it was easier to explain this way; please bear with me!

I've indexed a library of technical documentation, and the index stores
several fields per document: category, volume, title, text, etc.
Title and text are tokenised and stored; all other fields are just indexed.

When searching the index I am using the standard queryparser, and a typical
query might look like

"(title:graph AND title:axis) OR (text:graph AND text:axis)"

Because indexing includes synonym matching, I need the search to identify
matched terms in the content, e.g. in the above "graph" and "chart" are
synonyms, and "axis" and "axes" are as well.

So my search method executes the query to get a set of matching documents,
and uses the highlighter methods to identify the matches in the content:

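import java.io.FileWriter;
import java.io.IOException;
import java.util.HashSet;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.TextFragment;
import org.apache.lucene.search.highlight.TokenSources;

private void doSearch( IndexReader reader, IndexSearcher searcher, Query query,
        int max, FileWriter writer, FileWriter matchlist ) throws IOException {

    // hlPre="\001"; hlPost="\002"; control characters that delimit each matched term
    SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter( hlPre, hlPost );
    // The same query is used both for highlighting and for searching
    Highlighter highlighter = new Highlighter( htmlFormatter, new QueryScorer( query ) );

    TopDocs results = searcher.search( query, max );
    ScoreDoc[] hits = results.scoreDocs;
    int numTotalHits = Math.toIntExact( results.totalHits.value );

    HashSet<String> matchedWords = new HashSet<String>();
    int start = 0;
    int end = Math.min( numTotalHits, max );

    for ( int i = start; i < end; i++ ) {
        Document doc = searcher.doc( hits[i].doc );
        String text = doc.get( "text" );
        try {
            // Re-analyze the stored text so the highlighter can mark matched terms
            TokenStream tokens = TokenSources.getTokenStream( "text", null, text, analyzer, -1 );
            TextFragment[] frag = highlighter.getBestTextFragments( tokens, text, true, 100 );
            for ( int j = 0; j < frag.length; j++ ) {
                if ( ( frag[j] != null ) && ( frag[j].getScore() > 0 ) ) {
                    addMatchedTerms( matchedWords, frag[j].toString() );
                }
            }
        } catch ( IOException | InvalidTokenOffsetsException e ) {
            // exception handling not shown
        }
        writer.write( doc.get( "id" ) + "\n" );
    }

    // Write out every distinct highlighted term found across the hits
    for ( String word : matchedWords ) {
        matchlist.write( word + "\n" );
    }
}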

There's more of course but that's the guts of it; I haven't shown the
analyzer or the method which extracts the delimited words from the fragment
and adds them to the matchedWords hashset.
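
For readers following along, the helper that is not shown could look something like the sketch below. This is an assumption rather than the poster's actual code: the class name MatchedTerms and the parsing details are hypothetical, and only the addMatchedTerms name, its two arguments, and the "\001" / "\002" markers come from the post.

```java
import java.util.Set;

final class MatchedTerms {
    // Same control-character markers as hlPre and hlPost in the post.
    static final String PRE = "\001";
    static final String POST = "\002";

    // Collects every substring found between a PRE marker and the next POST marker.
    static void addMatchedTerms(Set<String> matchedWords, String fragment) {
        int from = 0;
        while (true) {
            int open = fragment.indexOf(PRE, from);
            if (open < 0) {
                return; // no more highlighted terms in this fragment
            }
            int close = fragment.indexOf(POST, open + PRE.length());
            if (close < 0) {
                return; // unbalanced marker: ignore the tail of the fragment
            }
            String term = fragment.substring(open + PRE.length(), close);
            if (!term.isEmpty()) {
                matchedWords.add(term);
            }
            from = close + POST.length();
        }
    }
}
```

Because the terms go into a set, a word that is highlighted several times in one fragment, or in several documents, is only recorded once.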

In the simple example shown this works fine, and the matched words include
graph and axis and any other synonyms found in the selected documents.

The problem occurs when I use the query to filter the search by category or
by volume. I'm doing this by adding extra conditions to the query, e.g.

"(category:note AND volume:extra) AND ((title:graph AND title:axis) OR
(text:graph AND text:axis))"

When we do this the search correctly returns only documents in the selected
category/volume, but unfortunately the highlighter.getBestTextFragments()
method marks all the occurrences of "note" and "extra" in the content too.
This we don't want.

I can't see how to separate that part of the query out in the highlighter
methods, and I wonder what best practice would be here. I'm probably being
naive in using a single query for the whole job. Do I need to run a query
for category/volume, and then a subquery on text and title, and just use the
subquery in the highlighter? If that's the approach, is there a nice simple
explanation somewhere you could point me to? Because I'm a simple user who
has never done anything beyond using the simple QueryParser for everything.

cheers

T

Re: Highlighting query results, my method is too crude, but how to improve it?

mkhl at apache

Feb 20, 2023, 6:21 AM

Post #2 of 5 (304 views)

Hello,
Maybe I'm missing the point, but can you highlight with a different query
from the one you search with?

On Mon, Feb 20, 2023 at 5:07 PM Trevor Nicholls <trevor@castingthevoid.com>
wrote:

> I can't see how to separate that part of the query out in the highlighter
> methods, and I wonder what best practice would be here. I'm probably being
> naive in using a single query for the whole job. Do I need to run a query
> for category/volume, and then a subquery on text and title, and just use
> the subquery in the highlighter? If that's the approach, is there a nice
> simple explanation somewhere you could point me to? Because I'm a simple
> user who has never done anything beyond using the simple QueryParser for
> everything.

--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!

RE: Highlighting query results, my method is too crude, but how to improve it?

trevor at castingthevoid

Feb 20, 2023, 7:20 AM

Post #3 of 5 (304 views)

Well I don't know; I suppose that's part of my question.

It's not immediately obvious to me that the "query" in these two lines:

Highlighter highlighter = new Highlighter( htmlFormatter, new QueryScorer( query ));
TopDocs results = searcher.search( query, max );

has to be the same. Maybe you can use the full query in searcher.search() to select your documents, and use a text-only query in Highlighter(htmlFormatter, new QueryScorer(query)) so that it only finds the matching terms in the selected records' text fields.

Can you? If so, this is not as difficult as I thought. But I might be missing something.

cheers
T

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Highlighting query results, my method is too crude, but how to improve it?

dawid.weiss at gmail

Feb 20, 2023, 10:16 PM

Post #4 of 5 (304 views)

You can use two different queries - the query is just used as a source of
information on what to highlight (it can even be completely different and
unrelated to the query that retrieved the documents).
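
One way to keep the two queries apart, sketched here under assumptions, is to hold the filter clause and the content clause as separate strings and only combine them for the search. The SplitQuery class and its field names are hypothetical and not part of Lucene; the query strings in the usage note are the ones from the original post.

```java
final class SplitQuery {
    // Query string to parse for the highlighter: content terms only.
    final String contentQuery;
    // Query string to parse for searcher.search(): filter AND content.
    final String fullQuery;

    SplitQuery(String filterClause, String contentClause) {
        this.contentQuery = contentClause;
        if (filterClause == null || filterClause.trim().isEmpty()) {
            // No category/volume filter: search and highlight with the same query.
            this.fullQuery = contentClause;
        } else {
            this.fullQuery = "(" + filterClause + ") AND (" + contentClause + ")";
        }
    }
}
```

With new SplitQuery("category:note AND volume:extra", "(title:graph AND title:axis) OR (text:graph AND text:axis)"), fullQuery is the filtered query string from the original post and contentQuery is the unfiltered one. Each string would then be parsed separately, with the Query built from fullQuery passed to searcher.search() and the Query built from contentQuery passed to new QueryScorer(...), so that "note" and "extra" are never offered to the highlighter.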

Separately, unified highlighter is great but you may also try the matches
API - I found it to be a much better source of information to get accurate
highlight ranges for more complex queries (a mix of term, intervals, spans,
etc.). This test class uses a highlighter implementation that leverages
this API:

https://github.com/apache/lucene/blob/main/lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchHighlighter.java#L241-L269

If things work for you then no rush to switch over - it's yet another
option to use.

Dawid

RE: Highlighting query results, my method is too crude, but how to improve it?

trevor at castingthevoid

Feb 21, 2023, 9:06 AM

Post #5 of 5 (303 views)

Thank you Dawid, very useful.

cheers
T
