Mailing List Archive

On which field document is searched
Hi,

I am creating a full-text search API, and one of my requirements is to find out exactly which field the input text matched when a document has, say, more than 10 fields.

Is there any way to find out which field in the document is most relevant to the input search text?

Thanks in advance.


Vivek Gobhil
Senior Technology Architect
Precisely.com

ATTENTION: -----
The information contained in this message (including any files transmitted with this message) may contain proprietary, trade secret or other confidential and/or legally privileged information. Any pricing information contained in this message or in any files transmitted with this message is always confidential and cannot be shared with any third parties without prior written approval from Precisely. This message is intended to be read only by the individual or entity to whom it is addressed or by their designee. If the reader of this message is not the intended recipient, you are on notice that any use, disclosure, copying or distribution of this message, in any form, is strictly prohibited. If you have received this message in error, please immediately notify the sender and/or Precisely and destroy all copies of this message in your possession, custody or control.
Re: On which field document is searched
I guess you could set up an experiment like this: search your text
against each field separately and then look at the scores. You would
need to normalize the scores in order to compare them, and the
normalization would probably involve the length of the field, etc.

Maybe there is an API in Lucene for this, but I don't know.
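
To make the idea concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class name FieldScorer, the whitespace/punctuation tokenizer, and the "matches divided by field length" score are all assumptions chosen for illustration, and real Lucene scoring (BM25 by default in recent versions) is more involved.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: scores one term against each field separately and
// normalizes by field length so that long and short fields are comparable.
final class FieldScorer {

    // Length-normalized term frequency: (occurrences of term) / (tokens in field).
    static double score(String fieldText, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int matches = 0;
        int tokenCount = 0;
        for (String token : fieldText.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            tokenCount++;
            if (token.equals(needle)) {
                matches++;
            }
        }
        return tokenCount == 0 ? 0.0 : (double) matches / tokenCount;
    }

    // Returns the name of the field with the highest normalized score,
    // or null when the term does not occur in any field.
    static String bestField(Map<String, String> doc, String term) {
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            double s = score(field.getValue(), term);
            if (s > bestScore) {
                bestScore = s;
                best = field.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Lucene in Action");
        doc.put("body", "This book covers search with Lucene and Solr in depth");
        System.out.println(bestField(doc, "lucene"));
    }
}
```

Both fields contain the term once, but the shorter title field gets the higher normalized score, which is the effect the normalization is meant to have.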

Hope this helps

Best regards


On 6/8/21 4:53 AM, Vivek Gobhil wrote:
>
> Hi,
>
> I am creating a full text search API and one of my requirement is to
> find out which exact field the input text is matched to if the
> document has say more than 10 fields.
>
> Is there any way I can find out what is the most relevant field in the
> document against the input search text.
>
> Thanks in advance.
>
Re: On which field document is searched
Hi Vivek,

I don't use Lucene directly; I mostly use Solr, so I may be wrong.
But I guess you can use the IndexSearcher#explain API to get the score
explanation. It should show which fields contributed to the final score
and by how much. Once you have that information, you can parse it on the
client side to find out which field contributed most.
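
A rough sketch of the parsing step, in plain Java so it runs without Lucene on the classpath. The Node class below is a hypothetical stand-in for Lucene's Explanation (value, description, nested details), and the regular expression assumes that term-level entries have descriptions starting with "weight(field:term in doc)", which is what the explain output typically looks like but is not a stable contract. Summing per field only makes sense when the top-level query adds its clauses together (a "sum of:" explanation).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Walks an explanation tree and totals the score contribution per field.
final class ExplainFields {

    // Hypothetical stand-in for org.apache.lucene.search.Explanation.
    static final class Node {
        final double value;
        final String description;
        final List<Node> details;

        Node(double value, String description, List<Node> details) {
            this.value = value;
            this.description = description;
            this.details = details;
        }
    }

    // Assumed description format: "weight(<field>:<term> in <doc>) [...], result of:"
    private static final Pattern WEIGHT = Pattern.compile("^weight\\(([^:]+):");

    static Map<String, Double> perField(Node root) {
        Map<String, Double> totals = new HashMap<>();
        collect(root, totals);
        return totals;
    }

    private static void collect(Node node, Map<String, Double> totals) {
        Matcher m = WEIGHT.matcher(node.description);
        if (m.find()) {
            // A term-level weight: credit its value to the field and stop
            // descending, so that its sub-details are not counted twice.
            totals.merge(m.group(1), node.value, Double::sum);
            return;
        }
        for (Node child : node.details) {
            collect(child, totals);
        }
    }

    // Field with the largest total contribution, or null if none was found.
    static String topField(Node root) {
        return perField(root).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        Node title = new Node(1.2, "weight(title:lucene in 0) [BM25Similarity], result of:", List.of());
        Node body = new Node(0.5, "weight(body:lucene in 0) [BM25Similarity], result of:", List.of());
        Node root = new Node(1.7, "sum of:", List.of(title, body));
        System.out.println(topField(root));
    }
}
```

With real Lucene, the tree would come from the Explanation returned by IndexSearcher#explain for the query and the document ID of a hit, read through its value, description, and details accessors.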

Regards,
Vinay

On Tue, Jun 8, 2021 at 10:01 PM <baris.kazar@oracle.com> wrote:

> I guess you can setup an experiment like
>
> search your text against each field and then look at the score but you
> need to normalize the score in order to compare and
>
> normalization will include probably length of the field etc.
>
> Maybe there is an api in lucene for this but i dont know.
>
> Hope this helps
>
> Best regards
>
>