I'm not sure if this is the cause of your problems, but when you're doing
deletions you need to close the reader before you open a writer, otherwise
deletions can be lost. You're claiming that additions are lost, but could
it really be that it is the deletions which have been lost? Try closing the
reader before you open the writer and tell me if that helps.
Doug
> -----Original Message-----
> From: Matt Read [mailto:mread@dircon.co.uk]
> Sent: Wednesday, October 24, 2001 10:35 AM
> To: lucene-user@jakarta.apache.org
> Subject: FW: Segments not merging on delete
>
>
> Hi, I'm having some problems with segments failing to merge
> when deleting
> documents from the index. I've followed the recommendations
> in the FAQ for
> avoiding a complete index rebuild when adding a new document, i.e. I'm
> deleting the document from the index and the re-adding it.
> The index however
> is growing even if I just replace the same document
> repeatedly. And although
> the reader.numDocs() method returns the correct value the
> reader.docFreq()
> for each term is increasing.
>
> I'd appreciate any help. Thanks.
>
> When I index a document, I have this code:
>
> // check if it's already there and if it is, delete it
> fsd = FSDirectory.getDirectory(indexPath, false);
> reader = IndexReader.open(fsd);
> Term t = new Term("original_path", f.getPath());
> if (reader.delete(t) > 0) {
> out.println("This file was already in the index,
> replacing it.<br>");
> }
>
> // add document to index
> writer = new IndexWriter(indexPath, new MyAnalyzer(), false);
> Document doc = new Document();
> doc.add(Field.Keyword("original_path", f.getPath()));
> doc.add(Field.Text("filename", f.getName()));
> doc.add(Field.Text("description", "a description"));
> writer.addDocument(doc);
>
> // clean up
> fsd.close();
> reader.close();
> writer.close();
>
> I then have a class to examine the documents in the index.
> When I first add
> a document it appears correctly, however as I re-add (delete then add
> documents as above), the reader.isDeleted() flag remains set
> for each of the
> re-added documents which would be ok assuming segments have
> not been merged
> yet but the re-added documents do not appear anywhere in the
> reader.documents(i) collection. The code to examine is as follows:
>
> fsd = FSDirectory.getDirectory(indexPath, false);
> reader = IndexReader.open(fsd);
>
> // show document details
> for (int i = 0; i < reader.numDocs(); i ++) {
> if (!reader.isDeleted(i)) {
> Document d = reader.document(i);
> for (Enumeration e = d.fields();
> e.hasMoreElements() ;) {
> Field f = (Field) e.nextElement();
> out.println(f.toString() + ": " +
> f.name() + "=" + f.stringValue() +
> "<br>");
> }
> }
> }
>
> // clean up
> fsd.close();
> reader.close();
> writer.close();
>
deletions you need to close the reader before you open a writer, otherwise
deletions can be lost. You're claiming that additions are lost, but could
it really be that it is the deletions which have been lost? Try closing the
reader before you open the writer and tell me if that helps.
Doug
> -----Original Message-----
> From: Matt Read [mailto:mread@dircon.co.uk]
> Sent: Wednesday, October 24, 2001 10:35 AM
> To: lucene-user@jakarta.apache.org
> Subject: FW: Segments not merging on delete
>
>
> Hi, I'm having some problems with segments failing to merge
> when deleting
> documents from the index. I've followed the recommendations
> in the FAQ for
> avoiding a complete index rebuild when adding a new document, i.e. I'm
> deleting the document from the index and the re-adding it.
> The index however
> is growing even if I just replace the same document
> repeatedly. And although
> the reader.numDocs() method returns the correct value the
> reader.docFreq()
> for each term is increasing.
>
> I'd appreciate any help. Thanks.
>
> When I index a document, I have this code:
>
> // check if it's already there and if it is, delete it
> fsd = FSDirectory.getDirectory(indexPath, false);
> reader = IndexReader.open(fsd);
> Term t = new Term("original_path", f.getPath());
> if (reader.delete(t) > 0) {
> out.println("This file was already in the index,
> replacing it.<br>");
> }
>
> // add document to index
> writer = new IndexWriter(indexPath, new MyAnalyzer(), false);
> Document doc = new Document();
> doc.add(Field.Keyword("original_path", f.getPath()));
> doc.add(Field.Text("filename", f.getName()));
> doc.add(Field.Text("description", "a description"));
> writer.addDocument(doc);
>
> // clean up
> fsd.close();
> reader.close();
> writer.close();
>
> I then have a class to examine the documents in the index.
> When I first add
> a document it appears correctly, however as I re-add (delete then add
> documents as above), the reader.isDeleted() flag remains set
> for each of the
> re-added documents which would be ok assuming segments have
> not been merged
> yet but the re-added documents do not appear anywhere in the
> reader.documents(i) collection. The code to examine is as follows:
>
> fsd = FSDirectory.getDirectory(indexPath, false);
> reader = IndexReader.open(fsd);
>
> // show document details
> for (int i = 0; i < reader.numDocs(); i ++) {
> if (!reader.isDeleted(i)) {
> Document d = reader.document(i);
> for (Enumeration e = d.fields();
> e.hasMoreElements() ;) {
> Field f = (Field) e.nextElement();
> out.println(f.toString() + ": " +
> f.name() + "=" + f.stringValue() +
> "<br>");
> }
> }
> }
>
> // clean up
> fsd.close();
> reader.close();
> writer.close();
>