Hello everyone,
I am experimenting with Interval Queries to get phrase match count within
parts of an indexed field.
ContainingIntervalsSource seemed to be the way to go, but it reports at
most a single match per region.
Example:
Field value: "[a b c d e a c] e f g h [a c k]" (the opening and closing
square brackets are not part of the text; they mark the regions of the
field I am interested in)
Within those regions I am trying to find all phrase match positions for,
say, 'a c' with slop=1.
The first region has 2 matches and the second region has 1.
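To make the expected counts concrete, here is a minimal plain-Java sketch
that does not use Lucene at all. The class and method names are made up for
illustration, the regions are given as inclusive token positions, and it
assumes that slop=1 means "the two terms in order, with at most one token
between them".

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: brute-force count of ordered
// two-term matches that lie fully inside each region.
class RegionPhraseCount {

    // regions[r] = {startPosition, endPosition}, both inclusive.
    // maxGaps is the assumed meaning of slop: tokens allowed between terms.
    static int[] countPerRegion(String[] tokens, int[][] regions,
                                String first, String second, int maxGaps) {
        int[] counts = new int[regions.length];
        for (int r = 0; r < regions.length; r++) {
            int start = regions[r][0];
            int end = regions[r][1];
            for (int i = start; i <= end; i++) {
                if (!tokens[i].equals(first)) {
                    continue;
                }
                // The second term must stay inside the region and within the gap limit.
                int limit = Math.min(end, i + 1 + maxGaps);
                for (int j = i + 1; j <= limit; j++) {
                    if (tokens[j].equals(second)) {
                        counts[r]++; // one (minimal) match per start position
                        break;
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] tokens = "a b c d e a c e f g h a c k".split(" ");
        int[][] regions = {{0, 6}, {11, 13}};
        // prints [2, 1]
        System.out.println(Arrays.toString(
                countPerRegion(tokens, regions, "a", "c", 1)));
    }
}
```

The per-region counts [2, 1] are what I would like to get out of the
interval iterator.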
ContainingIntervalsSource produces an iterator that returns only the first
match position for the first region ([a b c d e a c]), even though there
are two matches in this region. This behavior seems to be by design. Is it
possible to get all matches with the existing interval sources, or should
one write a custom source for this?
On a related note, would it make sense for ContainingIntervalsSource to
produce multiple match positions for the first region?
Thanks,
Elbek.