Hi,
I'm upgrading a project from Lucene 3.0.0 to 8.5.2.
Some tests are failing with a strange issue. We create fields that need position and offset information. Indexing a document with one value for the field works fine, but retrieving that document through a searcher, adding another value for the same field, and indexing it again results in the following exception:
java.lang.IllegalArgumentException: all instances of a given field name must have the same term vectors settings (storeTermVectorPositions changed for field="f1")
at org.apache.lucene.index.TermVectorsConsumerPerField.start(TermVectorsConsumerPerField.java:166)
at org.apache.lucene.index.TermsHashPerField.start(TermsHashPerField.java:294)
at org.apache.lucene.index.FreqProxTermsWriterPerField.start(FreqProxTermsWriterPerField.java:72)
at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:810)
at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:442)
at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:406)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:250)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:495)
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1594)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1213)
at com.profium.sir.LuceneTest.writeDoc(LuceneTest.java:66)
at com.profium.sir.LuceneTest.testLucene(LuceneTest.java:58)
This happens even though the exact same frozen FieldType object is used for the field each time.
I've isolated the problem to the following snippet, which reproduces it:
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;

public class LuceneTest {

    private static FieldType FIELD_TYPE = new FieldType();

    static {
        FIELD_TYPE.setStored(true);
        FIELD_TYPE.setTokenized(true);
        FIELD_TYPE.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        FIELD_TYPE.setStoreTermVectors(true);
        FIELD_TYPE.setStoreTermVectorPayloads(true);
        FIELD_TYPE.setStoreTermVectorPositions(true);
        FIELD_TYPE.setStoreTermVectorOffsets(true);
        FIELD_TYPE.freeze();
    }

    public static void main(String[] args) throws IOException {
        testLucene();
    }

    public static void testLucene() throws IOException {
        Document doc = new Document();
        doc.add(new Field("f1", "foo", FIELD_TYPE));
        writeDoc(doc);

        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(getDirectory()));
        doc = searcher.doc(0);
        doc.add(new Field("f1", "bar", FIELD_TYPE));
        writeDoc(doc);
    }

    private static void writeDoc(Document doc) throws IOException {
        Directory directory = getDirectory();
        IndexWriterConfig conf = new IndexWriterConfig(new EnglishAnalyzer());
        IndexWriter writer = new IndexWriter(directory, conf);
        writer.addDocument(doc);
        writer.flush();
        writer.close();
    }

    private static Directory getDirectory() throws IOException {
        return new MMapDirectory(Path.of("lucenttest"));
    }
}
Experimenting shows that if the following three properties are not set on the FieldType, the exception is no longer thrown. However, removing them breaks functionality of ours that depends on the position and offset information:
FIELD_TYPE.setStoreTermVectorPayloads(true);
FIELD_TYPE.setStoreTermVectorPositions(true);
FIELD_TYPE.setStoreTermVectorOffsets(true);
Perhaps I'm doing something I shouldn't be. Thanks in advance for any help!
Regards,
Albert
Albert MacSweeny
Profium, Lars Sonckin kaari 12, 02600 Espoo, Finland
Tel. +358 (0)9 855 98 000 Mob. +353 (0)87 664 2560
Internet: http://www.profium.com
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org