How to ignore a match if a given keyword is before/after another given keyword?
Hi all,

Does anyone know whether it is possible to search for documents containing a
given keyword only when that keyword is not preceded or followed by another
given keyword?

Thanks,
Jean
Re: How to ignore a match if a given keyword is before/after another given keyword?
Maybe you want (abstractly):

bool(must(term("f", "positive")), mustNot(phrase("f", "negative positive",
slop=1)))

On Thu, Apr 15, 2021 at 7:27 AM Jean Morissette <jean.morissette@gmail.com>
wrote:

> Hi all,
>
> Does anyone know whether it is possible to search for documents containing a
> given keyword only when that keyword is not preceded or followed by another
> given keyword?
>
> Thanks,
> Jean
>


--
Aditya
Re: How to ignore a match if a given keyword is before/after another given keyword?
Thank you for your answer.

The problem with this solution is that it excludes documents that contain
both a wanted match ("positive" on its own) and an unwanted one
("negative positive").

For example, consider these three documents containing the terms a and b:
- document 1: "a"
- document 2: "a b"
- document 3: "a b a"

We want to find documents containing the term 'a', ignoring only those
occurrences of 'a' that are followed by 'b'.
That is, we don't want to exclude a whole document just because one
occurrence of 'a' is followed by 'b'.

The right answer should be documents 1 and 3 but your solution excludes
document 3.
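
The difference between the two behaviours can be sketched in plain Java, without Lucene. This is only a model under stated assumptions: documents are pre-tokenized lists, "followed by" means directly adjacent (no slop), and the class and method names (MatchSemantics, documentLevel, matchLevel) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class MatchSemantics {
    // Document-level rule (models must + mustNot): keep a document that
    // contains `keep` and has NO occurrence of `keep` directly followed by `avoid`.
    static boolean documentLevel(List<String> tokens, String keep, String avoid) {
        boolean hasKeep = false;
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(keep)) continue;
            hasKeep = true;
            if (i + 1 < tokens.size() && tokens.get(i + 1).equals(avoid)) {
                return false; // one unwanted occurrence rejects the whole document
            }
        }
        return hasKeep;
    }

    // Match-level rule (the behaviour asked for): keep a document if AT LEAST ONE
    // occurrence of `keep` is not directly followed by `avoid`.
    static boolean matchLevel(List<String> tokens, String keep, String avoid) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(keep)) continue;
            boolean followed = i + 1 < tokens.size() && tokens.get(i + 1).equals(avoid);
            if (!followed) return true; // a single clean occurrence is enough
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "b", "a"));
        for (int d = 0; d < docs.size(); d++) {
            System.out.println("document " + (d + 1)
                + ": documentLevel=" + documentLevel(docs.get(d), "a", "b")
                + ", matchLevel=" + matchLevel(docs.get(d), "a", "b"));
        }
    }
}
```

Both rules accept document 1 and reject document 2; they differ only on document 3, which the document-level rule drops and the match-level rule keeps.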

Is this achievable with Lucene?

Thanks,
Jean


On Thu, 15 Apr 2021 at 01:33, Aditya Varun Chadha <adichad@gmail.com> wrote:

> Maybe you want (abstractly):
>
> bool(must(term("f", "positive")), mustNot(phrase("f", "negative positive",
> slop=1)))
Re: How to ignore a match if a given keyword is before/after another given keyword?
Hi Jean,

You should be able to do this with intervals; see
https://lucene.apache.org/core/8_8_1/queries/org/apache/lucene/queries/intervals/package-summary.html
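
The thread does not say which interval operator was used in the end. As a rough illustration of the idea, the plain-Java sketch below models interval subtraction: take every occurrence of 'a' as an interval, take every "a b" phrase as an interval, and keep the 'a' intervals that overlap no phrase interval. The class and helper names (IntervalSketch, term, phrase, nonOverlapping, matches) are hypothetical and only mimic the concept; the Lucene expression in the comment is an assumption about how this maps onto the intervals package, not something confirmed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IntervalSketch {
    // Intervals are {start, end} token positions, both inclusive.
    // Assumed Lucene counterpart (not taken from the thread):
    //   new IntervalQuery("f", Intervals.nonOverlapping(
    //       Intervals.term("a"), Intervals.phrase("a", "b")))

    // One single-position interval per occurrence of the term.
    static List<int[]> term(List<String> tokens, String t) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equals(t)) out.add(new int[] {i, i});
        }
        return out;
    }

    // One two-position interval wherever `first` is directly followed by `second`.
    static List<int[]> phrase(List<String> tokens, String first, String second) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + 1).equals(second)) {
                out.add(new int[] {i, i + 1});
            }
        }
        return out;
    }

    // Keep each minuend interval that overlaps no subtrahend interval.
    static List<int[]> nonOverlapping(List<int[]> minuend, List<int[]> subtrahend) {
        List<int[]> out = new ArrayList<>();
        for (int[] m : minuend) {
            boolean overlaps = false;
            for (int[] s : subtrahend) {
                if (m[0] <= s[1] && s[0] <= m[1]) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) out.add(m);
        }
        return out;
    }

    // A document matches if at least one `keep` interval survives the subtraction.
    static boolean matches(List<String> tokens, String keep, String avoid) {
        return !nonOverlapping(term(tokens, keep), phrase(tokens, keep, avoid)).isEmpty();
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "b", "a"));
        for (int d = 0; d < docs.size(); d++) {
            System.out.println("document " + (d + 1) + ": " + matches(docs.get(d), "a", "b"));
        }
    }
}
```

In this model documents 1 and 3 match and document 2 does not. To also ignore occurrences of 'a' that are preceded by 'b', the same subtraction could be repeated with a "b a" phrase as the subtrahend.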

On Sun, 25 Apr 2021 at 18:43, Jean Morissette <jean.morissette@gmail.com>
wrote:

> Thank you for your answer.
>
> The problem with this solution is that it excludes documents that contain
> both a wanted match ("positive" on its own) and an unwanted one
> ("negative positive").
>
> For example, consider these three documents containing the terms a and b:
> - document 1: "a"
> - document 2: "a b"
> - document 3: "a b a"
>
> We want to find documents containing the term 'a', ignoring only those
> occurrences of 'a' that are followed by 'b'.
> That is, we don't want to exclude a whole document just because one
> occurrence of 'a' is followed by 'b'.
>
> The right answer should be documents 1 and 3 but your solution excludes
> document 3.
>
> Is this achievable with Lucene?
>
> Thanks,
> Jean
Re: How to ignore a match if a given keyword is before/after another given keyword?
Using intervals worked, thank you for your help!

On Sun, 25 Apr 2021 at 13:52, Adrien Grand <jpountz@gmail.com> wrote:

> Hi Jean,
>
> You should be able to do this with intervals; see
> https://lucene.apache.org/core/8_8_1/queries/org/apache/lucene/queries/intervals/package-summary.html
Re: How to ignore a match if a given keyword is before/after another given keyword?
Great to hear!

On Tue, 27 Apr 2021 at 22:44, Jean Morissette <jean.morissette@gmail.com>
wrote:

> Using intervals worked, thank you for your help!