Mailing List Archive

Indexing part number
Could anybody tell me what should be changed in the "IndexFiles"
demo to let me index and query "pure" digit part numbers? Currently only
alphabetic queries seem to work; digits and special characters (-, _, /, ...)
are ignored.

Thanks

Jean-Marc Bertinchamps
RE: Indexing part number
Try using org.apache.lucene.analysis.standard.StandardAnalyzer instead of
StopAnalyzer. This will index numbers, etc. Have a look at
StandardTokenizer.jj in the sources for details. If that grammar is not
quite right, copy it and compile your own tokenizer.

Doug
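
To illustrate the difference Doug describes, here is a minimal sketch in plain Java. It does not use Lucene and is not Lucene's implementation: the class and method names (TokenizerSketch, letterTokens, partNumberTokens) are hypothetical, and the rule for keeping "-", "_", "/" and "." inside a token is a simplifying assumption, not the StandardTokenizer.jj grammar. The first method splits at every non-letter, which is roughly why a letter-based analyzer loses digits; the second keeps digits and joining punctuation so a part number survives as one token. Stop-word removal is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerSketch {

    // Letter-only tokenizing: every non-letter ends the current token,
    // so digits and punctuation never reach the index.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Assumed rule for this sketch: letters and digits are kept, and
    // '-', '_', '/', '.' are kept only between two alphanumeric characters.
    static List<String> partNumberTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean joiner = "-_/.".indexOf(c) >= 0
                    && current.length() > 0
                    && i + 1 < text.length()
                    && Character.isLetterOrDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || joiner) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Part 109-A/7 rev_2";
        System.out.println(letterTokens(text));     // prints [part, a, rev]
        System.out.println(partNumberTokens(text)); // prints [part, 109-a/7, rev_2]
    }
}
```

Whatever tokenizer is chosen, the same one should normally be applied to the query text as to the indexed text; otherwise a query for "109" may be reduced to nothing before it reaches the index.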

> -----Original Message-----
> From: Jean-Marc Bertinchamps [mailto:jmbertinchamps@edpsa.com]
> Sent: Tuesday, October 02, 2001 6:22 AM
> To: lucene-user@jakarta.apache.org
> Subject: Indexing part number
Re: Indexing part number
Hi,

I have an incremental index, currently about 1.34 GB. I get the following exception when adding new files to the index:

java.lang.IllegalStateException: field originType cannot be an indexed field.
at com.lucene.index.FieldInfos.add(java/com/lucene/index/FieldInfos.java:72)
at com.lucene.index.FieldInfos.add(java/com/lucene/index/FieldInfos.java:63)
at com.lucene.index.SegmentMerger.mergeFields(java/com/lucene/index/SegmentMerger.java:69)
at com.lucene.index.SegmentMerger.merge(java/com/lucene/index/SegmentMerger.java:53)
at com.lucene.index.IndexWriter.mergeSegments(java/com/lucene/index/IndexWriter.java:267)
at com.lucene.index.IndexWriter.mergeSegments(java/com/lucene/index/IndexWriter.java:241)
at com.lucene.index.IndexWriter.maybeMergeSegments(java/com/lucene/index/IndexWriter.java:230)
at com.lucene.index.IndexWriter.addDocument(java/com/lucene/index/IndexWriter.java:125)
.......

Can anyone suggest what the cause might be? originType is one of the fields I index. Is it related to the size of the index?

ASM
----- Original Message -----
From: Doug Cutting
To: 'jmbertinchamps@edpsa.com' ; lucene-user@jakarta.apache.org
Sent: Wednesday, October 03, 2001 4:25 AM
Subject: RE: Indexing part number


Re: Indexing part number
I am using StandardAnalyzer and want to search on numbers, but I get an error when I query for a number. My query is headline:(-() + ("109" ))(-() + ("109" )). Can anyone suggest something?

ASM