Mailing List Archive

How do I get TermPositions for a given document?
I have an IndexReader and I want to get a TermPositions object for a given
document.
Right now it seems that it only works the other way: you can only get
TermPositions for a term, or globally for all terms.
Basically, I want to know the positions of all the words in a given doc.
Is it the case that the index is not set up for this because it's really
optimized to work the other way?

I looked through the RC5 javadoc and the latest Oct 23 source. It may be that I
missed it...

--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
Re: How do I get TermPositions for a given document?
Spencer, Dave wrote:

>Is it the case that the index is not set up for this as it's really
>optimized to work the other way?
>
Yes, that's correct. I've developed a set of add-on modules for doing
the reverse lookup (aka TermVector support), but right now I lack the time
to integrate them with the latest Lucene version and contribute them. So if
you are looking for a quick solution, there isn't one. If you are
looking to add this to Lucene, and are willing to dig deep into the file
formats, perhaps you can help bring the code I have up to date with
the latest Lucene version.
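
To illustrate why the lookup is cheap in one direction and costly in the
other, here is a minimal sketch of a toy inverted index in Java. It is not
Lucene's API and not the add-on code mentioned above: the class
ToyInvertedIndex and all of its methods are hypothetical names invented for
this illustration, and the whitespace tokenizer is an assumption. The postings
are keyed by term, so the positions for one term are a single map lookup,
while the positions for one document require a pass over every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of an inverted index. NOT Lucene's API; every name is hypothetical.
class ToyInvertedIndex {
    // term -> (document id -> positions of that term in the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    // Index a document by splitting on whitespace (an assumed, simplistic
    // tokenizer). The position of a token is its offset in the token list.
    void addDocument(int docId, String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return; // nothing to index
        }
        String[] tokens = trimmed.split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Cheap direction: the index is keyed by term, so this is one lookup.
    Map<Integer, List<Integer>> positionsForTerm(String term) {
        return postings.getOrDefault(term, new TreeMap<>());
    }

    // Expensive direction: visit every term in the index and keep the ones
    // that occur in the requested document.
    Map<String, List<Integer>> positionsForDocument(int docId) {
        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, List<Integer>>> e : postings.entrySet()) {
            List<Integer> positions = e.getValue().get(docId);
            if (positions != null) {
                result.put(e.getKey(), positions);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(0, "to be or not to be");
        index.addDocument(1, "be quick");
        System.out.println(index.positionsForTerm("be"));
        System.out.println(index.positionsForDocument(0));
    }
}
```

In this toy model the cost of positionsForDocument grows with the number of
distinct terms in the whole index, not with the size of the one document.
Storing a per-document map at indexing time would avoid that scan at the
price of extra index space, which is roughly the idea behind a term vector.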

Re: How do I get TermPositions for a given document?
I can try to do that. I may need some help, or maybe not. Can you send the code to the mailing list?
Thank you.

--

On Wed, 23 Oct 2002 16:16:58
Dmitry Serebrennikov wrote:
>Yes, that's correct. I've developed a set of add-on modules for doing
>the reverse lookup (aka TermVector support), but right now I lack the time
>to integrate them with the latest Lucene version and contribute them. So if
>you are looking for a quick solution, there isn't one. If you are
>looking to add this to Lucene, and are willing to dig deep into the file
>formats, perhaps you can help bring the code I have up to date with
>the latest Lucene version.

Re: How do I get TermPositions for a given document?
The mailing list does not accept postings that large. I already tried to
do this before.
Maybe I should just upload it to the sandbox somehow?

none none wrote:

>I can try to do that. I may need some help, or maybe not. Can you send the code to the mailing list?
>Thank you.

Re: How do I get TermPositions for a given document?
The sandbox is fine with me,
or if it is less than 5 MB you can send it to my email address directly.

--

On Wed, 23 Oct 2002 16:28:32
Dmitry Serebrennikov wrote:
>The mailing list does not accept postings that large. I already tried to
>do this before.
>Maybe I should just upload it to the sandbox somehow?

Re: How do I get TermPositions for a given document?
Putting it into the sandbox area sounds great.

Just add it to the contributions area in a project called TermPositions,
or something more clever if you have a better name.

Let me know if you have any problems adding it, as others may have time
to help out.

--Peter


On Wednesday, October 23, 2002, at 03:28 PM, Dmitry Serebrennikov wrote:

> The mailing list does not accept postings that large. I already tried
> to do this before.
> Maybe I should just upload it to the sandbox somehow?

Re: How do I get TermPositions for a given document?
Dmitry would need commit access to the Lucene-sandbox to add the code in, I
believe...

Regards,
Kelvin


On Wed, 23 Oct 2002 23:21:45 -0700, Peter Carlson wrote:
>Putting it into the sandbox area sounds great.
>
>Just add it to the contributions area in a project called TermPositions,
>or something more clever if you have a better name.
>
>Let me know if you have any problems adding it, as others may have time
>to help out.
>
>--Peter

Re: How do I get TermPositions for a given document?
Which I used to have at some point, but I'm not sure it is still active.
Dmitry.

Kelvin Tan wrote:

>Dmitry would need commit access to the Lucene-sandbox to add the code in, I
>believe...

Re: How do I get TermPositions for a given document?
Let me know if you have any problems.

All Lucene committers should be lucene-sandbox committers.

--Peter


On Thursday, October 24, 2002, at 06:35 AM, Dmitry Serebrennikov wrote:

> Which I used to have at some point, but I'm not sure it is still
> active.
> Dmitry.

