Mailing List Archive

Exploring a different approach to skip lists
Hey folks-

I've been experimenting with a different approach to implementing skip
lists in Lucene, and would be curious if anyone else has tried
something similar, or has early thoughts/feedback on what I'm doing.

The idea is generally a simplification of what Lucene currently does,
targeted at improving QPS at search-time. Instead of treating skip
lists as forward-iterating structures, I'm indexing a single, flat
structure that I binary search when advancing. I'm indexing the same
data present in the lowest level of the current structures (i.e., skip
pointers to each block of 128 docs), and then binary searching those
pointers to find the candidate block for the target docID (instead of
only seeking forward).

Before describing this idea in more detail (or opening a Jira), I'd be
curious how this community thinks about disk access operations and
what platforms/use-cases we primarily optimize for these days. This
approach I've been experimenting with is essentially a non-starter if
we assume that index accesses generally involve actually going to
disk—especially if those disks spin. Random seeks all over the place
are probably a terrible idea if those are actually hitting disk. In
the cases I've been experimenting with, indexes are generally hot, so
random seeks aren't much of a concern. Do we tend to optimize with
this assumption in mind, or are we still pretty careful about
use-cases that are actually doing a lot of disk IO?

There are some other tricky implications with the approach I'm
experimenting with (some edge cases around Impacts and index size
growth due to not being able to do as much delta-encoding), but it's
not worth going into those until addressing the main idea of
whether-or-not random seeks even make sense.

I've done some early benchmarking with luceneutil (wikimedium10m
specifically) and the idea looks like it might have some promise. I
don't really see any regressions to QPS*, while a few tasks seem to
show significant QPS improvements (AndHighLow: 9.8%, OrNotHighLow:
11.1%, OrHighLow: 12.3%).
* (note: I've disabled Impacts in both baseline/candidate for early
experiments because of some challenges related to them... so these
results have a major asteriks right now and more work would certainly
need to be done)

Thanks in advance for any feedback! I'm completely open to shooting
down this idea early if there are some fundamental flaws, or
alternatively opening up a Jira to investigate this further if folks
think it's reasonable to explore more :)

Cheers,
-Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Exploring a different approach to skip lists [ In reply to ]
Hi Greg,

I like that Lucene can scale to index sizes that are much larger than the
amount of main memory, so I would like the default codec to keep optimizing
for sequential reads. We do random access for some parts of the index like
the terms index and the points index, but the expectation is that they are
so small that would always fit in the page cache (they were actually in the
JVM heap not long ago). A skip index for every postings list feels like
something that could be much larger than the terms index so I don't think
it would be a good fit for the default codec.

We could still implement your idea in lucene/codecs for users who can force
load their index into memory, the speedup looks significant for queries
that do lots of skipping!

These results make me wonder if there are other ways we could get similar
benefits while keeping a sequential access pattern when reading postings?

Le mar. 27 avr. 2021 à 21:03, Greg Miller <gsmiller@gmail.com> a écrit :

> Hey folks-
>
> I've been experimenting with a different approach to implementing skip
> lists in Lucene, and would be curious if anyone else has tried
> something similar, or has early thoughts/feedback on what I'm doing.
>
> The idea is generally a simplification of what Lucene currently does,
> targeted at improving QPS at search-time. Instead of treating skip
> lists as forward-iterating structures, I'm indexing a single, flat
> structure that I binary search when advancing. I'm indexing the same
> data present in the lowest level of the current structures (i.e., skip
> pointers to each block of 128 docs), and then binary searching those
> pointers to find the candidate block for the target docID (instead of
> only seeking forward).
>
> Before describing this idea in more detail (or opening a Jira), I'd be
> curious how this community thinks about disk access operations and
> what platforms/use-cases we primarily optimize for these days. This
> approach I've been experimenting with is essentially a non-starter if
> we assume that index accesses generally involve actually going to
> disk—especially if those disks spin. Random seeks all over the place
> are probably a terrible idea if those are actually hitting disk. In
> the cases I've been experimenting with, indexes are generally hot, so
> random seeks aren't much of a concern. Do we tend to optimize with
> this assumption in mind, or are we still pretty careful about
> use-cases that are actually doing a lot of disk IO?
>
> There are some other tricky implications with the approach I'm
> experimenting with (some edge cases around Impacts and index size
> growth due to not being able to do as much delta-encoding), but it's
> not worth going into those until addressing the main idea of
> whether-or-not random seeks even make sense.
>
> I've done some early benchmarking with luceneutil (wikimedium10m
> specifically) and the idea looks like it might have some promise. I
> don't really see any regressions to QPS*, while a few tasks seem to
> show significant QPS improvements (AndHighLow: 9.8%, OrNotHighLow:
> 11.1%, OrHighLow: 12.3%).
> * (note: I've disabled Impacts in both baseline/candidate for early
> experiments because of some challenges related to them... so these
> results have a major asteriks right now and more work would certainly
> need to be done)
>
> Thanks in advance for any feedback! I'm completely open to shooting
> down this idea early if there are some fundamental flaws, or
> alternatively opening up a Jira to investigate this further if folks
> think it's reasonable to explore more :)
>
> Cheers,
> -Greg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
Re: Exploring a different approach to skip lists [ In reply to ]
Thanks for the reply Adrien! It makes sense to ensure the default
codec continues to scale for large indexes that can't fit in a
machine's physical memory. Thanks for the thoughts and context on the
terms/points indexes, etc.

I'll look into how this idea could be spun off into a separate
lucene/codecs implementation and open a Jira issue to track that work
a little later this week. I'll also spend a little time thinking about
whether-or-not there might be some sort of hybrid solution that
enables some of these gains while maintaining the sequential access. I
don't have anything off the top-of-my-head, but maybe if I put a draft
of my change up as a PR (under lucene/codecs) it will spark some other
ideas!

Thanks again for your thoughts!

Cheers,
-Greg

On Tue, Apr 27, 2021 at 2:23 PM Adrien Grand <jpountz@gmail.com> wrote:
>
> Hi Greg,
>
> I like that Lucene can scale to index sizes that are much larger than the amount of main memory, so I would like the default codec to keep optimizing for sequential reads. We do random access for some parts of the index like the terms index and the points index, but the expectation is that they are so small that would always fit in the page cache (they were actually in the JVM heap not long ago). A skip index for every postings list feels like something that could be much larger than the terms index so I don't think it would be a good fit for the default codec.
>
> We could still implement your idea in lucene/codecs for users who can force load their index into memory, the speedup looks significant for queries that do lots of skipping!
>
> These results make me wonder if there are other ways we could get similar benefits while keeping a sequential access pattern when reading postings?
>
> Le mar. 27 avr. 2021 à 21:03, Greg Miller <gsmiller@gmail.com> a écrit :
>>
>> Hey folks-
>>
>> I've been experimenting with a different approach to implementing skip
>> lists in Lucene, and would be curious if anyone else has tried
>> something similar, or has early thoughts/feedback on what I'm doing.
>>
>> The idea is generally a simplification of what Lucene currently does,
>> targeted at improving QPS at search-time. Instead of treating skip
>> lists as forward-iterating structures, I'm indexing a single, flat
>> structure that I binary search when advancing. I'm indexing the same
>> data present in the lowest level of the current structures (i.e., skip
>> pointers to each block of 128 docs), and then binary searching those
>> pointers to find the candidate block for the target docID (instead of
>> only seeking forward).
>>
>> Before describing this idea in more detail (or opening a Jira), I'd be
>> curious how this community thinks about disk access operations and
>> what platforms/use-cases we primarily optimize for these days. This
>> approach I've been experimenting with is essentially a non-starter if
>> we assume that index accesses generally involve actually going to
>> disk—especially if those disks spin. Random seeks all over the place
>> are probably a terrible idea if those are actually hitting disk. In
>> the cases I've been experimenting with, indexes are generally hot, so
>> random seeks aren't much of a concern. Do we tend to optimize with
>> this assumption in mind, or are we still pretty careful about
>> use-cases that are actually doing a lot of disk IO?
>>
>> There are some other tricky implications with the approach I'm
>> experimenting with (some edge cases around Impacts and index size
>> growth due to not being able to do as much delta-encoding), but it's
>> not worth going into those until addressing the main idea of
>> whether-or-not random seeks even make sense.
>>
>> I've done some early benchmarking with luceneutil (wikimedium10m
>> specifically) and the idea looks like it might have some promise. I
>> don't really see any regressions to QPS*, while a few tasks seem to
>> show significant QPS improvements (AndHighLow: 9.8%, OrNotHighLow:
>> 11.1%, OrHighLow: 12.3%).
>> * (note: I've disabled Impacts in both baseline/candidate for early
>> experiments because of some challenges related to them... so these
>> results have a major asteriks right now and more work would certainly
>> need to be done)
>>
>> Thanks in advance for any feedback! I'm completely open to shooting
>> down this idea early if there are some fundamental flaws, or
>> alternatively opening up a Jira to investigate this further if folks
>> think it's reasonable to explore more :)
>>
>> Cheers,
>> -Greg
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Exploring a different approach to skip lists [ In reply to ]
+1 to explore innovations in Lucene's skip-list implementation!

Your initial impressive gains show that indeed skipping is a big cost for
certain queries.

Lucene's multi-level skip list implementation is quite old now (I think
more than a decade?) and I think rather complex. There is a fun paper
describing a simple and probabilistic approach to inline the (still
multi-level) skip data, linked from this long-ago issue:
https://issues.apache.org/jira/browse/LUCENE-2962 That would be a way to
keep the sequential (slow disk friendly) access and perhaps reduce the cost
versus Lucenes skipping implementation today.

But I do love the simplicity of a simple flat array of the bottom-most skip
data. Having that as an optional (not default) Codec would be a nice
option.

Mike McCandless

http://blog.mikemccandless.com


On Tue, Apr 27, 2021 at 7:42 PM Greg Miller <gsmiller@gmail.com> wrote:

> Thanks for the reply Adrien! It makes sense to ensure the default
> codec continues to scale for large indexes that can't fit in a
> machine's physical memory. Thanks for the thoughts and context on the
> terms/points indexes, etc.
>
> I'll look into how this idea could be spun off into a separate
> lucene/codecs implementation and open a Jira issue to track that work
> a little later this week. I'll also spend a little time thinking about
> whether-or-not there might be some sort of hybrid solution that
> enables some of these gains while maintaining the sequential access. I
> don't have anything off the top-of-my-head, but maybe if I put a draft
> of my change up as a PR (under lucene/codecs) it will spark some other
> ideas!
>
> Thanks again for your thoughts!
>
> Cheers,
> -Greg
>
> On Tue, Apr 27, 2021 at 2:23 PM Adrien Grand <jpountz@gmail.com> wrote:
> >
> > Hi Greg,
> >
> > I like that Lucene can scale to index sizes that are much larger than
> the amount of main memory, so I would like the default codec to keep
> optimizing for sequential reads. We do random access for some parts of the
> index like the terms index and the points index, but the expectation is
> that they are so small that would always fit in the page cache (they were
> actually in the JVM heap not long ago). A skip index for every postings
> list feels like something that could be much larger than the terms index so
> I don't think it would be a good fit for the default codec.
> >
> > We could still implement your idea in lucene/codecs for users who can
> force load their index into memory, the speedup looks significant for
> queries that do lots of skipping!
> >
> > These results make me wonder if there are other ways we could get
> similar benefits while keeping a sequential access pattern when reading
> postings?
> >
> > Le mar. 27 avr. 2021 à 21:03, Greg Miller <gsmiller@gmail.com> a écrit :
> >>
> >> Hey folks-
> >>
> >> I've been experimenting with a different approach to implementing skip
> >> lists in Lucene, and would be curious if anyone else has tried
> >> something similar, or has early thoughts/feedback on what I'm doing.
> >>
> >> The idea is generally a simplification of what Lucene currently does,
> >> targeted at improving QPS at search-time. Instead of treating skip
> >> lists as forward-iterating structures, I'm indexing a single, flat
> >> structure that I binary search when advancing. I'm indexing the same
> >> data present in the lowest level of the current structures (i.e., skip
> >> pointers to each block of 128 docs), and then binary searching those
> >> pointers to find the candidate block for the target docID (instead of
> >> only seeking forward).
> >>
> >> Before describing this idea in more detail (or opening a Jira), I'd be
> >> curious how this community thinks about disk access operations and
> >> what platforms/use-cases we primarily optimize for these days. This
> >> approach I've been experimenting with is essentially a non-starter if
> >> we assume that index accesses generally involve actually going to
> >> disk—especially if those disks spin. Random seeks all over the place
> >> are probably a terrible idea if those are actually hitting disk. In
> >> the cases I've been experimenting with, indexes are generally hot, so
> >> random seeks aren't much of a concern. Do we tend to optimize with
> >> this assumption in mind, or are we still pretty careful about
> >> use-cases that are actually doing a lot of disk IO?
> >>
> >> There are some other tricky implications with the approach I'm
> >> experimenting with (some edge cases around Impacts and index size
> >> growth due to not being able to do as much delta-encoding), but it's
> >> not worth going into those until addressing the main idea of
> >> whether-or-not random seeks even make sense.
> >>
> >> I've done some early benchmarking with luceneutil (wikimedium10m
> >> specifically) and the idea looks like it might have some promise. I
> >> don't really see any regressions to QPS*, while a few tasks seem to
> >> show significant QPS improvements (AndHighLow: 9.8%, OrNotHighLow:
> >> 11.1%, OrHighLow: 12.3%).
> >> * (note: I've disabled Impacts in both baseline/candidate for early
> >> experiments because of some challenges related to them... so these
> >> results have a major asteriks right now and more work would certainly
> >> need to be done)
> >>
> >> Thanks in advance for any feedback! I'm completely open to shooting
> >> down this idea early if there are some fundamental flaws, or
> >> alternatively opening up a Jira to investigate this further if folks
> >> think it's reasonable to explore more :)
> >>
> >> Cheers,
> >> -Greg
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
Re: Exploring a different approach to skip lists [ In reply to ]
Thanks Mike! I've started reading that paper, and it's interesting
indeed! I don't have much time to explore this further at the moment,
but I plan to revisit when I do. It'd be fun to explore the idea in
the paper to see if a "best of both worlds" approach is possible
(speedups of binary searching while maintaining forward-iteration
paradigm)! It would be interesting to compare that implementation to
the simple, flat binary-searched list as well.

Cheers,
-Greg

On Mon, May 3, 2021 at 10:31 AM Michael McCandless
<lucene@mikemccandless.com> wrote:
>
> +1 to explore innovations in Lucene's skip-list implementation!
>
> Your initial impressive gains show that indeed skipping is a big cost for certain queries.
>
> Lucene's multi-level skip list implementation is quite old now (I think more than a decade?) and I think rather complex. There is a fun paper describing a simple and probabilistic approach to inline the (still multi-level) skip data, linked from this long-ago issue: https://issues.apache.org/jira/browse/LUCENE-2962 That would be a way to keep the sequential (slow disk friendly) access and perhaps reduce the cost versus Lucenes skipping implementation today.
>
> But I do love the simplicity of a simple flat array of the bottom-most skip data. Having that as an optional (not default) Codec would be a nice option.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Apr 27, 2021 at 7:42 PM Greg Miller <gsmiller@gmail.com> wrote:
>>
>> Thanks for the reply Adrien! It makes sense to ensure the default
>> codec continues to scale for large indexes that can't fit in a
>> machine's physical memory. Thanks for the thoughts and context on the
>> terms/points indexes, etc.
>>
>> I'll look into how this idea could be spun off into a separate
>> lucene/codecs implementation and open a Jira issue to track that work
>> a little later this week. I'll also spend a little time thinking about
>> whether-or-not there might be some sort of hybrid solution that
>> enables some of these gains while maintaining the sequential access. I
>> don't have anything off the top-of-my-head, but maybe if I put a draft
>> of my change up as a PR (under lucene/codecs) it will spark some other
>> ideas!
>>
>> Thanks again for your thoughts!
>>
>> Cheers,
>> -Greg
>>
>> On Tue, Apr 27, 2021 at 2:23 PM Adrien Grand <jpountz@gmail.com> wrote:
>> >
>> > Hi Greg,
>> >
>> > I like that Lucene can scale to index sizes that are much larger than the amount of main memory, so I would like the default codec to keep optimizing for sequential reads. We do random access for some parts of the index like the terms index and the points index, but the expectation is that they are so small that would always fit in the page cache (they were actually in the JVM heap not long ago). A skip index for every postings list feels like something that could be much larger than the terms index so I don't think it would be a good fit for the default codec.
>> >
>> > We could still implement your idea in lucene/codecs for users who can force load their index into memory, the speedup looks significant for queries that do lots of skipping!
>> >
>> > These results make me wonder if there are other ways we could get similar benefits while keeping a sequential access pattern when reading postings?
>> >
>> > Le mar. 27 avr. 2021 à 21:03, Greg Miller <gsmiller@gmail.com> a écrit :
>> >>
>> >> Hey folks-
>> >>
>> >> I've been experimenting with a different approach to implementing skip
>> >> lists in Lucene, and would be curious if anyone else has tried
>> >> something similar, or has early thoughts/feedback on what I'm doing.
>> >>
>> >> The idea is generally a simplification of what Lucene currently does,
>> >> targeted at improving QPS at search-time. Instead of treating skip
>> >> lists as forward-iterating structures, I'm indexing a single, flat
>> >> structure that I binary search when advancing. I'm indexing the same
>> >> data present in the lowest level of the current structures (i.e., skip
>> >> pointers to each block of 128 docs), and then binary searching those
>> >> pointers to find the candidate block for the target docID (instead of
>> >> only seeking forward).
>> >>
>> >> Before describing this idea in more detail (or opening a Jira), I'd be
>> >> curious how this community thinks about disk access operations and
>> >> what platforms/use-cases we primarily optimize for these days. This
>> >> approach I've been experimenting with is essentially a non-starter if
>> >> we assume that index accesses generally involve actually going to
>> >> disk—especially if those disks spin. Random seeks all over the place
>> >> are probably a terrible idea if those are actually hitting disk. In
>> >> the cases I've been experimenting with, indexes are generally hot, so
>> >> random seeks aren't much of a concern. Do we tend to optimize with
>> >> this assumption in mind, or are we still pretty careful about
>> >> use-cases that are actually doing a lot of disk IO?
>> >>
>> >> There are some other tricky implications with the approach I'm
>> >> experimenting with (some edge cases around Impacts and index size
>> >> growth due to not being able to do as much delta-encoding), but it's
>> >> not worth going into those until addressing the main idea of
>> >> whether-or-not random seeks even make sense.
>> >>
>> >> I've done some early benchmarking with luceneutil (wikimedium10m
>> >> specifically) and the idea looks like it might have some promise. I
>> >> don't really see any regressions to QPS*, while a few tasks seem to
>> >> show significant QPS improvements (AndHighLow: 9.8%, OrNotHighLow:
>> >> 11.1%, OrHighLow: 12.3%).
>> >> * (note: I've disabled Impacts in both baseline/candidate for early
>> >> experiments because of some challenges related to them... so these
>> >> results have a major asteriks right now and more work would certainly
>> >> need to be done)
>> >>
>> >> Thanks in advance for any feedback! I'm completely open to shooting
>> >> down this idea early if there are some fundamental flaws, or
>> >> alternatively opening up a Jira to investigate this further if folks
>> >> think it's reasonable to explore more :)
>> >>
>> >> Cheers,
>> >> -Greg
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org