Hello everyone,
We recently had a segment merge fail with "No space left on device".
After a restart, Lucene failed to open the index with a CorruptIndexException.
We expected Lucene to recover automatically in this case, because
there was no successful commit. Is that a correct assumption, or
am I missing something?
We would appreciate any recommendations for avoiding such situations
in the future and for recovering automatically after a restart.
The Lucene version is 8.5.0.
Failed merge stack trace:
2021-02-02T08:51:51.679+0000
org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: No space left on device
at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:704)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
Caused by: java.io.IOException: No space left on device
at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:79)
at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:280)
at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
at java.base/java.nio.channels.Channels.writeFully(Channels.java:97)
at java.base/java.nio.channels.Channels$1.write(Channels.java:172)
at org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.java:416)
at java.base/java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:74)
at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
at org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:53)
at org.apache.lucene.store.RateLimitedIndexOutput.writeBytes(RateLimitedIndexOutput.java:73)
at org.apache.lucene.util.compress.LZ4.encodeLiterals(LZ4.java:159)
at org.apache.lucene.util.compress.LZ4.encodeSequence(LZ4.java:172)
at org.apache.lucene.util.compress.LZ4.compress(LZ4.java:441)
at org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:165)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:229)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.finishDocument(CompressingStoredFieldsWriter.java:159)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.merge(CompressingStoredFieldsWriter.java:636)
at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:229)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:106)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4463)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4057)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
Followed by failed startup:
2021-02-02T08:52:07.926+0000
org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/data/5f91aa0b07ce4d5e7beffaa2/segments_578fu")))
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:846)
Caused by: java.nio.file.NoSuchFileException: /data/5f91aa0b07ce4d5e7beffaa2/_6lfem.si
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:182)
at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
at java.base/java.nio.channels.FileChannel.open(FileChannel.java:345)
at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:81)
at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:157)
at org.apache.lucene.codecs.lucene70.Lucene70SegmentInfoFormat.read(Lucene70SegmentInfoFormat.java:91)
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:353)
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
... 33 common frames omitted
Thank you!
--
Regards,
Alexander L