Mailing List Archive

Windows issue with "Using MemorySegmentIndexInput with Java 20" (?)
I know there isn’t enough information here to actually solve anything, but I’m fairly sure I’m pointing at a real issue, and reporting it might help in the future.

I’ll describe what happened:

1. My dev machine is a Windows 11 PC with Java 20.
2. I’m not modifying the index on the dev machine; I’m working on a snapshot taken from the production server. The index size is about 60 GB.
3. A few days ago I started the local server and was surprised to see that the index is corrupt. It failed to decompress a stored field. Something deep inside Lucene.
4. A bit worried, I took another copy of the same snapshot, replaced the existing index, and it worked. I thought that maybe the Windows antivirus was responsible for this.
Note that the last-modified timestamps of all files except “write.lock” were unchanged, still showing the snapshot time.
5. This morning it happened again, but this time I kept the old index in a separate folder. I did a binary comparison of the old index against the new one and they were identical (except for write.lock).
6. I swapped the two indices back, ran the local server with the “corrupt” index, and it worked.
7. I upgraded to Java 20 a few weeks ago. Such issues never happened in the last two years.
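
The binary comparison in step 5 can be repeated with a few lines of plain Java. This is only a minimal sketch under stated assumptions: the class name IndexDirCompare is hypothetical, both index folders are assumed to be flat (no subfolders), write.lock is skipped as in step 5, and Files.mismatch requires Java 12 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene: byte-for-byte comparison of two index folders.
class IndexDirCompare {

    /** Names of the regular files directly inside dir, ignoring write.lock. */
    static Set<String> fileNames(Path dir) throws IOException {
        Set<String> names = new TreeSet<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                 .map(p -> p.getFileName().toString())
                 .filter(n -> !n.equals("write.lock"))
                 .forEach(names::add);
        }
        return names;
    }

    /** Returns one line per difference; an empty list means the folders match. */
    static List<String> compare(Path a, Path b) throws IOException {
        List<String> diffs = new ArrayList<>();
        Set<String> namesA = fileNames(a);
        Set<String> namesB = fileNames(b);
        for (String n : namesA) {
            if (!namesB.contains(n)) {
                diffs.add("only in first: " + n);
                continue;
            }
            // Files.mismatch returns -1 when the two files have identical content.
            long pos = Files.mismatch(a.resolve(n), b.resolve(n));
            if (pos != -1) {
                diffs.add(n + " differs at byte " + pos);
            }
        }
        for (String n : namesB) {
            if (!namesA.contains(n)) {
                diffs.add("only in second: " + n);
            }
        }
        return diffs;
    }

    public static void main(String[] args) throws IOException {
        List<String> diffs = compare(Path.of(args[0]), Path.of(args[1]));
        if (diffs.isEmpty()) {
            System.out.println("identical (write.lock ignored)");
        } else {
            diffs.forEach(System.out::println);
        }
    }
}
```

Run it with the two folder paths as arguments. Identical files on disk combined with a failure at read time would be consistent with the problem being in how the bytes are read rather than in the files themselves, though it does not prove it.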



My (uneducated) guess is that an OS cache gets corrupted at some point, and that it might be related to the new mmap implementation.



Erel
Re: Windows issue with "Using MemorySegmentIndexInput with Java 20" (?)
Hi Erel,

> 3. A few days ago I started the local server and was surprised to see
> that the index is corrupt. It failed to decompress a stored field.
> Something deep inside Lucene.
>

If you can include a stack trace, it would be great. Also, try running
CheckIndex on that index to see if it says anything.


> 7. I upgraded to Java 20 a few weeks ago. Such issues never
> happened in the last two years.
>

Hard to tell. You could try running with the previous version of Java and
see if this happens there too? It could be a bug somewhere in the new
implementation or some other quirk. A repro would be ideal, if you can
isolate it.
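
One possible starting point for isolating it, sketched with the JDK only: read every index file once through a memory mapping and once through ordinary channel reads, and compare checksums. This is an assumption-laden sketch, not Lucene's code path. The class name MmapReadCheck is hypothetical, and it uses the classic MappedByteBuffer API rather than the MemorySegment API that MemorySegmentIndexInput is built on, so a clean run here does not rule out a problem there.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Lucene: compares mapped reads with ordinary reads.
class MmapReadCheck {

    // A single MappedByteBuffer cannot exceed 2 GB, so large files are mapped in 1 GB chunks.
    static final long CHUNK = 1L << 30;

    /** CRC32 of the file content, read through read-only memory mappings. */
    static long crcViaMmap(Path file) throws IOException {
        CRC32 crc = new CRC32();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += CHUNK) {
                long len = Math.min(CHUNK, size - pos);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                crc.update(buf);
            }
        }
        return crc.getValue();
    }

    /** CRC32 of the file content, read through ordinary channel reads. */
    static long crcViaRead(Path file) throws IOException {
        CRC32 crc = new CRC32();
        ByteBuffer buf = ByteBuffer.allocate(1 << 16);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (ch.read(buf) != -1) {
                buf.flip();
                crc.update(buf);
                buf.clear();
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Path.of(args[0]))) {
            for (Path f : files) {
                // write.lock may be held by a running server, so it is skipped.
                if (!Files.isRegularFile(f) || f.getFileName().toString().equals("write.lock")) {
                    continue;
                }
                boolean same = crcViaMmap(f) == crcViaRead(f);
                System.out.println((same ? "OK       " : "MISMATCH ") + f.getFileName());
            }
        }
    }
}
```

Pass the index folder as the only argument. Running it under both Java versions while the index is in its "corrupt" state would show whether mapped and unmapped reads of the same file disagree.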

Dawid