Mailing List Archive

Proximity operators in lucene
Hi,
Does anyone know if there is a way to use a proximity operator (like
near or within) in the lucene engine?
For example, if you want to know that microsoft is within 3 words of ibm
you could use
microsoft n3 ibm

If not, how difficult is this to add the the indexing and searching
system?

Any thanks would be appreciated.

--Peter
RE: Proximity operators in lucene [ In reply to ]
> -----Original Message-----
> From: carlson@bookandhammer.com [mailto:carlson@bookandhammer.com]
> Sent: Wednesday, October 24, 2001 8:17 AM
> To: lucene-user@jakarta.apache.org
> Subject: Proximity operators in lucene
>
>
> Hi,
> Does anyone know if there is a way to use a proximity operator (like
> near or within) in the lucene engine?
> For example, if you want to know that microsoft is within 3 words of ibm
> you could use
> microsoft n3 ibm

There's a control variable called slop in PhraseQuery that performs what you
desire. By default, slop is set to 0 so that only exact phrases will match.
However, you can alter the value using the setSlop() method, eg setSlop(3)
to get the effect you desire.


Dave Kor Kian Wei
Consultant
Product Engineering
NexusEdge Technologies Pte. Ltd.
6 Aljunied Ave 3, #01-02 (Level 4)
Singapore 389932
Tel : (+65)848-2552
Fax : (+65)747-4536
Web : www.nexusedge.com
RE: Proximity operators in lucene [ In reply to ]
Thas was the slop constant is for. Look at the email below. Thats exactly
the way to do it.

-----Original Message-----
From: Dave Kor [mailto:dave.kor@nexusedge.com]
Sent: Tuesday, October 23, 2001 11:00 PM
To: carlson@bookandhammer.com; lucene-user@jakarta.apache.org
Subject: RE: Proximity operators in lucene


> -----Original Message-----
> From: carlson@bookandhammer.com [mailto:carlson@bookandhammer.com]
> Sent: Wednesday, October 24, 2001 8:17 AM
> To: lucene-user@jakarta.apache.org
> Subject: Proximity operators in lucene
>
>
> Hi,
> Does anyone know if there is a way to use a proximity operator (like
> near or within) in the lucene engine?
> For example, if you want to know that microsoft is within 3 words of ibm
> you could use
> microsoft n3 ibm

There's a control variable called slop in PhraseQuery that performs what you
desire. By default, slop is set to 0 so that only exact phrases will match.
However, you can alter the value using the setSlop() method, eg setSlop(3)
to get the effect you desire.


Dave Kor Kian Wei
Consultant
Product Engineering
NexusEdge Technologies Pte. Ltd.
6 Aljunied Ave 3, #01-02 (Level 4)
Singapore 389932
Tel : (+65)848-2552
Fax : (+65)747-4536
Web : www.nexusedge.com