Mailing List Archive

EOF error in VectorValues in Lucene nightly benchmarks
Hi!

I recently posted about profiling the nightly benchmarks, and in the
process of doing that I came across an exception that looks rather nasty. I
was about to create a Jira issue, but the nice Jira submit form told me to
post it here or on IRC first.

The Lucene commit where I saw this was
eb24e95731b9f865b95b821c1745264fdc58119, which was the master branch's HEAD
this Saturday. It is an EOF error, which seems to happen when reading
vector values:

Exception in thread "Thread-1" java.lang.RuntimeException: java.lang.RuntimeException: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-17_eb24e95_medium_1thread/index/_32.vec") [slice=vector-data]
    at perf.TaskThreads$TaskThread.run(TaskThreads.java:105)
Caused by: java.lang.RuntimeException: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-17_eb24e95_medium_1thread/index/_32.vec") [slice=vector-data]
    at perf.SearchTask.go(SearchTask.java:322)
    at perf.TaskThreads$TaskThread.run(TaskThreads.java:91)
Caused by: java.io.EOFException: seek past EOF: MMapIndexInput(path="/home/anton/dev/lucene-bench-home/indices/lucene_bench_2021-01-17_eb24e95_medium_1thread/index/_32.vec") [slice=vector-data]
    at org.apache.lucene.store.ByteBufferIndexInput.seek(ByteBufferIndexInput.java:255)
    at org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl.seek(ByteBufferIndexInput.java:575)
    at org.apache.lucene.codecs.lucene90.Lucene90VectorReader$OffHeapVectorValues.vectorValue(Lucene90VectorReader.java:432)
    at org.apache.lucene.util.hnsw.HnswGraph.search(HnswGraph.java:118)
    at org.apache.lucene.codecs.lucene90.Lucene90VectorReader$OffHeapVectorValues.search(Lucene90VectorReader.java:409)
    at perf.KnnQuery$KnnWeight.scorer(KnnQuery.java:88)
    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:166)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:743)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:664)
    at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:510)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:520)
    at perf.SearchTask.go(SearchTask.java:263)

Sorry about the long stack trace in an email. I'm still not quite up to
speed with the etiquette on this mailing list; should I have attached a
file instead?

I have reproduced it 3 times on my machine, running the nightly benchmark's
search benchmark with the competition.WIKI_MEDIUM_ALL source. It does not
occur with e.g. the competition.WIKI_MEDIUM_10M source. The source of the
competition file is at
https://github.com/mikemccand/luceneutil/blob/master/src/python/competition.py
I ran the indexing with only 1 thread; I'm not sure whether that matters.

I'll happily provide more information regarding system setup etc. if that
can help in figuring things out.

best regards,
Anton Hägerstrand
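
[Editorial note] For readers unfamiliar with the "seek past EOF" message in the trace above, here is a minimal sketch of the kind of bounds check that produces it. It is not Lucene's code: the class VectorSeekSketch, the method checkedOffset, and the back-to-back float layout are all assumptions made for illustration only.

```java
import java.io.EOFException;

/** Illustrative only: hypothetical names, not Lucene's actual reader code. */
final class VectorSeekSketch {

    /**
     * Returns the byte offset of vector number ord, assuming vectors are
     * stored back to back as `dimension` 4-byte floats each (an assumed layout).
     * Throws EOFException if that vector would not fit inside the slice.
     */
    static long checkedOffset(long sliceLength, int ord, int dimension) throws EOFException {
        long vectorBytes = (long) dimension * Float.BYTES;
        long offset = (long) ord * vectorBytes; // long arithmetic, so no int overflow
        if (ord < 0 || offset + vectorBytes > sliceLength) {
            throw new EOFException(
                "seek past EOF: offset=" + offset + " sliceLength=" + sliceLength);
        }
        return offset;
    }

    public static void main(String[] args) throws EOFException {
        int dimension = 100;
        long sliceLength = 1_000L * dimension * Float.BYTES; // room for 1000 vectors

        // Ordinal 999 is the last vector that fits in the slice.
        System.out.println(checkedOffset(sliceLength, 999, dimension));

        try {
            // Ordinal 1000 starts exactly at the end of the slice, so it is rejected.
            checkedOffset(sliceLength, 1_000, dimension);
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sketch only shows that an offset outside the slice triggers the exception. Whether the failure reported here comes from an out-of-range ordinal, an arithmetic overflow, a truncated file, or something else is not established in this thread.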
Re: EOF error in VectorValues in Lucene nightly benchmarks
Thanks Anton! Providing a stack trace is great; we can afford the
black pixels I think ;)

I found some bugs in a separate effort which might be related,
although I saw a different exception so I'm not sure. I'll post a
patch soon, and if you are able to re-test and see if you can
reproduce, that would be a huge help!

-Mike

On Thu, Jan 21, 2021 at 4:37 PM Anton Hägerstrand <anton@blunders.io> wrote:

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: EOF error in VectorValues in Lucene nightly benchmarks
@Anton; see https://github.com/apache/lucene-solr/pull/2239. I did run
some luceneutil vector tasks, and did not see any exceptions in the
logs; then again, I did not see them with or without this patch.
I think you should definitely open an issue, and please post the
command you ran that generated the exceptions.

On Fri, Jan 22, 2021 at 4:21 PM Michael Sokolov <msokolov@gmail.com> wrote:
Re: EOF error in VectorValues in Lucene nightly benchmarks
Hi! Sorry about the late reply. I tried running with your PR applied, but
the EOFException unfortunately still occurred. I have created
https://issues.apache.org/jira/browse/LUCENE-9715 for this. Let me know if
you would like me to try running with some other patch or configuration.

Thanks,
Anton

On Sat, 23 Jan 2021 at 21:57, Michael Sokolov <msokolov@gmail.com> wrote:
