Mailing List Archive

Lucene's LRU Query Cache - Deep Dive
Hello!

I work at Amazon Product Search and I’ve recently been looking into understanding how Lucene’s LRU Query Cache works. I’ve written up a summary of my understanding here. (Also attached as a markdown file with this email)

I would really appreciate feedback/improvements/corrections on my understanding, and to know whether this is worthy of contributing to the documentation for LRUQueryCache. :)

=================================
A brief overview of Lucene's Query Cache

We first get acquainted with Lucene’s caching at IndexSearcher’s createWeight method, which is called for every query (and consequently for sub-queries within that query, e.g. see BooleanWeight) before we can actually find matching documents in our index and score them. A Weight is really another representation of a query, specific to the statistics of the IndexSearcher being used. This definition makes it easier to see why caching logic would start while creating the weight for the query - we want to make a weight that is responsible for caching matching docs per segment. Since segments are specific to the IndexReader being used by the IndexSearcher, they are, transitively, specific to the IndexSearcher.
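
To make this concrete, here is roughly what the entry point looks like (method signatures per recent Lucene versions, so treat the sketch as illustrative):

    // Assuming 'searcher' is an IndexSearcher and 'query' is a Query:
    Query rewritten = searcher.rewrite(query);
    // Bind the query to this searcher's statistics; no matching happens yet.
    Weight weight = searcher.createWeight(rewritten, ScoreMode.COMPLETE_NO_SCORES, 1f);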

QueryCache in Lucene is an interface that declares just one method - doCache. doCache takes in a Weight (the weight of the query in question, e.g. TermWeight for a TermQuery) and operates on it based on the rules defined by QueryCachingPolicy (yet another interface, defining two methods - onUse and shouldCache) to return a Weight. This “new” returned Weight possibly wraps the original weight and bestows upon it some caching abilities.
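
In code, the two interfaces look roughly like this (simplified from the Lucene sources; see the javadocs for the authoritative definitions):

    public interface QueryCache {
      // Possibly wrap 'weight' so that its matches get cached per 'policy'.
      Weight doCache(Weight weight, QueryCachingPolicy policy);
    }

    public interface QueryCachingPolicy {
      // Called on every use of the query, e.g. to track its frequency.
      void onUse(Query query);
      // Whether the query is worth caching right now.
      boolean shouldCache(Query query) throws IOException;
    }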

As of now, Lucene has one core implementation each of QueryCache and QueryCachingPolicy. All IndexSearcher instances have a default query cache - LRUQueryCache - and use the default policy - UsageTrackingQueryCachingPolicy. In the IndexSearcher’s createWeight method, we first create a weight for the incoming query and then subject it to the LRUQueryCache’s doCache method. An important thing to note here is that we only call doCache when the score mode passed to the search method does not need scores. Calling doCache does nothing complex; it just returns the input weight wrapped as a CachingWrapperWeight that encapsulates the caching policy information. No real caching has happened, yet!
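
Both defaults can be overridden per searcher; a minimal sketch (the constructor arguments here are illustrative limits, not recommendations):

    IndexSearcher searcher = new IndexSearcher(reader);
    searcher.setQueryCache(new LRUQueryCache(1_000, 32 * 1024 * 1024)); // maxSize, maxRamBytesUsed
    searcher.setQueryCachingPolicy(new UsageTrackingQueryCachingPolicy());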

After getting a weight from the createWeight method, the IndexSearcher iterates over each leaf and uses the weight to create a BulkScorer. A BulkScorer, as the name suggests, is used to score a range of documents - generally the range being all the matches found in that leaf. Given the context information for a leaf, every weight should know how to create a bulk scorer for that leaf. In our case, the CachingWrapperWeight’s bulkScorer method does a little bit extra, and this is where the actual caching happens!

A brief dive into the query caching policy: While LRU determines what gets evicted from a full cache, query caching policies let us define additional rules to use in conjunction with the cache’s eviction policy. The default UsageTrackingQueryCachingPolicy dictates which queries will be put into the cache. This policy uses a ring buffer data structure optimised to track and retrieve frequencies for a given query. The policy also defines some queries that will never be cached (TermQuery, MatchAllDocsQuery and MatchNoDocsQuery, to name a few). The onUse method on this policy checks whether the incoming query is something it would like to cache and, if so, registers a use of this query. The UsageTrackingQueryCachingPolicy defines minimum frequencies for queries - 2 for costly queries and 5 for others. Once a cacheable query has been seen at least the minimum number of times, the shouldCache method of the policy gives it the green light for caching.
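
A hypothetical condensation of that decision (the helper names neverCache, isCostly and frequency are mine, not Lucene’s; the real logic lives in UsageTrackingQueryCachingPolicy):

    boolean shouldCache(Query query) {
      if (neverCache(query)) {       // e.g. TermQuery, MatchAllDocsQuery, ...
        return false;
      }
      int minFrequency = isCostly(query) ? 2 : 5;  // costly queries cache sooner
      return frequency(query) >= minFrequency;     // seen often enough?
    }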

A brief dive into LRUQueryCache: There are many attributes in this class that track RAM usage and the allowed maximum memory size for the cache, and they play important roles, but for easy understanding we want to highlight two attributes - Map<IndexReader.CacheKey, LeafCache> cache and Map<Query, Query> uniqueQueries. uniqueQueries: A LinkedHashMap with a bounded capacity and accessOrder set to true behaves like an LRU cache, and uniqueQueries is just that. While the keys in uniqueQueries are Query instances, the values are also Query instances, not the DocIdSet of matches as we might expect. This LRU map is shared across all segments, and its purpose is to store a singleton for each of the most recently seen queries so that we don’t store multiple copies of the same query. cache: This maps each leaf in the index, represented by an IndexReader.CacheKey, to a LeafCache instance. The LeafCache isn’t itself a cache but a Map<Query, CacheAndCount>, where Query is a cached Query instance from uniqueQueries and CacheAndCount is the actual data structure that stores the DocIdSet of the matches found for the query in that leaf.
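
The LRU behaviour of uniqueQueries comes from a plain Java idiom; a minimal stand-alone illustration (Lucene’s real eviction also accounts for RAM usage, so this is only the shape of the idea):

    import java.util.LinkedHashMap;
    import java.util.Map;
    import org.apache.lucene.search.Query;

    final int maxSize = 1_000; // illustrative capacity
    Map<Query, Query> uniqueQueries =
        new LinkedHashMap<>(16, 0.75f, /*accessOrder=*/ true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<Query, Query> eldest) {
            return size() > maxSize; // drop the least-recently-accessed query
          }
        };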

Returning to the creation of the BulkScorer from the CachingWrapperWeight, we first register the query (and hence its frequency) with the policy via the onUse method. At present, Lucene only wants to cache queries on leaves that have more than 10k documents and hold more than 3% of the total number of documents in the index. Once we determine that the segment is eligible for caching, we use a CacheHelper to get the CacheKey for the segment so that we can access the LeafCache corresponding to it. We check whether the query has a CacheAndCount entry in the LeafCache and return it (a cache hit!) or return null (a cache miss). In the case of a cache miss, we use the policy’s shouldCache to determine whether this query is cache-eligible and, if it is, we create a CacheAndCount instance for the query-leaf pair. We add the Query to uniqueQueries, and the query and its CacheAndCount instance to the LeafCache for that leaf. We’ve finally cached a query and its match set!
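
Stitched together as hypothetical pseudocode (helper names like leafIsEligible, buildCache and scorerFrom are mine; the real logic is spread across CachingWrapperWeight and LRUQueryCache):

    policy.onUse(query);                      // register this use of the query
    if (!leafIsEligible(context)) {           // e.g. segment too small
      return in.bulkScorer(context);          // 'in' is the wrapped weight
    }
    IndexReader.CacheKey key = context.reader().getCoreCacheHelper().getKey();
    CacheAndCount cached = leafCacheFor(key).get(query);
    if (cached == null) {                     // cache miss
      if (!policy.shouldCache(query)) {
        return in.bulkScorer(context);        // not worth caching (yet)
      }
      cached = buildCache(in, context);       // consumes all matches in the leaf
      putIntoCache(query, key, cached);       // also registers in uniqueQueries
    }
    return scorerFrom(cached);                // constant-score over cached matches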

When creating a CacheAndCount instance for a query that is being newly cached, we want to find the matches for that query. We use the actual weight for that query (the one wrapped inside the CachingWrapperWeight) to create a BulkScorer and then call score on that scorer to populate a LeafCollector with the matches for the query in this leaf. Based on the density of the result set, we back the DocIdSet with either a FixedBitSet (dense) or a RoaringDocIdSet (sparse).
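
A sketch of that choice under stated assumptions (the density threshold below is roughly what LRUQueryCache uses, but check the source for your version):

    DocIdSet cacheMatches(DocIdSetIterator matches, int maxDoc) throws IOException {
      if (matches.cost() * 100 >= maxDoc) {          // dense: ~1% of docs or more
        FixedBitSet bits = new FixedBitSet(maxDoc);
        bits.or(matches);                            // one bit per matching doc
        return new BitDocIdSet(bits);
      }
      RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(maxDoc);
      for (int doc = matches.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS;
           doc = matches.nextDoc()) {
        builder.add(doc);                            // sparse: compressed blocks
      }
      return builder.build();
    }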

Finally, we create a DocIdSetIterator from the retrieved or created CacheAndCount instance and return the most commonly used BulkScorer - the DefaultBulkScorer with a ConstantScoreScorer (score set to 0) over the DocIdSetIterator. When there is a cache miss and the query is ineligible for caching, we return the BulkScorer from the actual weight wrapped inside the CachingWrapperWeight, as we would if no caching logic were present!

Thanks,
Shradha
Re: Lucene's LRU Query Cache - Deep Dive
Hey Shradha,

This correctly describes the what, but I think it could add more color
about why the cache behaves this way to be more useful, e.g.
- Why doesn't the cache cache all queries? Lucene is relatively good at
evaluating a subset of the matching documents, e.g. queries sorted by
numeric field can leverage point index structures to only look at a small
subset of the matching docs. Yet caching a query requires consuming all its
matches, so it could significantly hurt latencies. It's important to not
cache all queries to preserve the benefit of Lucene's filtering and dynamic
pruning capabilities.
- A corollary of the above is that the query cache is essentially a way to
trade latency for throughput. Use-cases that care more about latency than
throughput may want to disable the cache entirely (see the sketch after
this list).
- LRUQueryCache takes a `skipCacheFactor` which aims at limiting the
impact of query caching on latency by not caching clauses whose cost is
much higher than the overall query. It only helps for filters within
conjunctions though, not in the dynamic pruning case when we don't know how
many matches are going to be consumed.
- Why are small segments never cached? Small segments are likely going to
be merged soon, so it would be wasteful to build cache entries that would
get evicted shortly.
- The queries that never get cached don't get cached because a cached
entry wouldn't perform faster than their uncached counterpart. An inverted
index is already a cache of the matches for every unique term of the index.
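
A minimal sketch of both knobs, assuming a recent Lucene version (the
constructor argument order and the values are illustrative; check the
LRUQueryCache javadocs for your release):

    // Latency-sensitive use-case: opt out of query caching entirely.
    searcher.setQueryCache(null);

    // Throughput-oriented use-case: keep the cache, but bound its latency
    // impact with skipCacheFactor.
    Predicate<LeafReaderContext> leavesToCache = ctx -> true; // eligibility
    searcher.setQueryCache(new LRUQueryCache(
        1_000,                // max number of cached queries
        32 * 1024 * 1024,     // max RAM for the cache, in bytes
        leavesToCache,
        10f));                // skip clauses >10x costlier than the query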

--
Adrien
Re: Lucene's LRU Query Cache - Deep Dive
Thanks for the deep-dive Shradha. Thank you Adrien for the additional questions and answers.

I had a couple of questions while looking around the cache code.

1. The `QueryCachingPolicy` [1] makes decisions based on `Query`. Why not use `Weight`?
The `scorerSupplier` [2] in the `LRUQueryCache` decides whether something should be cached by determining the cost [3] using the `Weight`. IIUC, this was introduced because “Query caching leads to absurdly slow queries” [4].
What if the `QueryCachingPolicy` was called with the `Weight` instead? Since the `Query` can be obtained from the `Weight`, we can have all such caching decisions in the policy, and de-couple that decision from the `LRUQueryCache` class. What do you think?

2. Why do we invoke a possibly costly `cost()` method for every cache addition?
During the above cost computation, we call the `supplier.cost()` method; but its documentation [5] states that it “may be a costly operation, so it should only be called if necessary".
This means that we’re including a (possibly) costly operation for every cache addition. If we de-couple these, then, for cases where the cache addition is expensive, we can use the call to `cost`, but for other cases, we can avoid this expensive call.

If you or the community think this is a good idea, I can open a JIRA and submit a PR.

References:
[1] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/QueryCachingPolicy.java
[2] https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725
[3] https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767
[4] https://github.com/apache/lucene-solr/pull/940/files
[5] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40


Regards,
Mohammad Sadiq


Re: Lucene's LRU Query Cache - Deep Dive
1. I believe that this would require pulling a ScorerSupplier twice for the
same segment, which is a costly operation.

2. The cost is computed in order to know whether the top-level query is
likely to consume a large-enough portion of the matches of the query that
we are considering caching so that caching this query wouldn't hurt latency
too much. Making a bad decision here because the cost is unknown would lead
to a worse situation than computing the cost on every query that we are
considering caching.

On both of these questions, I feel like I may be missing the point about
the suggestion you are making so feel free to show a simple code change
that could help me understand the change that you are suggesting.

--
Adrien
Re: Lucene's LRU Query Cache - Deep Dive
Hey Adrien,

Apologies for the delay in my response.
Thank you so much for the detailed reply and feedback on my write-up. I will incorporate your suggestions and follow up with a new iteration soon.

Regards,
Shradha

Re: Lucene's LRU Query Cache - Deep Dive
Thank you for the explanation Adrien!
Looks like you’re right. When I tried making the code changes, it was not as straightforward as I had imagined.

Regards,
Mohammad Sadiq


> On 19 Jul 2022, at 23:19, Adrien Grand <jpountz@gmail.com> wrote:
>
> 1. I believe that this would require pulling a ScorerSupplier twice for the
> same segment, which is a costly operation.
>
> 2. The cost is computed in order to know whether the top-level query is
> likely to consume a large-enough portion of the matches of the query that
> we are considering caching so that caching this query wouldn't hurt latency
> too much. Making a bad decision here because the cost is unknown would lead
> to a worse situation than computing the cost on every query that we are
> considering caching.
>
> On both of these questions, I feel like I may be missing the point about
> the suggestion you are making so feel free to show a simple code change
> that could help me understand the change that you are suggesting.
>
> On Thu, Jul 14, 2022 at 12:26 PM Mohammad Sadiq <sadiqlinux@gmail.com <mailto:sadiqlinux@gmail.com>>
> wrote:
>
>> Thanks for the deep-dive Shradha. Thank you Adrien for the additional
>> questions and answers.
>>
>> I had a couple of questions, when looking around the cache code.
>>
>> 1. The `QueryCachingPolicy` [1] makes decisions based on `Query`. Why not
>> use `Weight`?
>> The `scorerSupplier` [2] in the `LRUQueryCache` decides whether something
>> should be cached by determining the cost [3] using the `Weight`. IIUC, this
>> was introduced because “Query caching leads to absurdly slow queries” [4].
>> What if the `QueryCachingPolicy` was called with the `Weight` instead?
>> Since the `Query` can be obtained from the `Weight`, we can have all such
>> caching decisions in the policy, and de-couple that decision from the
>> `LRUQueryCache` class. What do you think?
>>
>> 2. Why do we invoke a possibly costly `cost()` method for every cache
>> addition?
>> During the above cost computation, we call the `supplier.cost()` method;
>> but its documentation [5] states that it “may be a costly operation, so it
>> should only be called if necessary".
>> This means that we’re including a (possibly) costly operation for every
>> cache addition. If we de-couple these, then, for cases where the cache
>> addition is expensive, we can use the call to `cost`, but for other cases,
>> we can avoid this expensive call.
>>
>> If you, or the community thinks that this is a good idea, then I can open
>> a JIRA, and submit a PR.
>>
>> References:
>> [1]
>> https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/QueryCachingPolicy.java
>> <
>> https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/QueryCachingPolicy.java <https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/QueryCachingPolicy.java>
>>>
>> [2]
>> https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725 <https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725>
>> <
>> https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725 <https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L725>
>>>
>> [3]
>> https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767 <https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767>
>> <
>> https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767 <https://github.com/apache/lucene/blob/941df98c3f718371af4702c92bf6537739120064/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L767>
>>>
>> [4] https://github.com/apache/lucene-solr/pull/940/files <https://github.com/apache/lucene-solr/pull/940/files> <
>> https://github.com/apache/lucene-solr/pull/940/files <https://github.com/apache/lucene-solr/pull/940/files>>
>> [5]
>> https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40 <https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40>
>> <
>> https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40 <https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/search/ScorerSupplier.java#L39-L40>
>>>
>>
>>
>> Regards,
>> Mohammad Sadiq
>>
>>
>>> On 11 Jul 2022, at 10:37, Adrien Grand <jpountz@gmail.com> wrote:
>>>
>>> Hey Shradha,
>>>
>>> This correctly describes the what, but to be more useful I think it could
>>> add more color about why the cache behaves this way, e.g.
>>> - Why doesn't the cache cache all queries? Lucene is relatively good at
>>> evaluating only a subset of the matching documents, e.g. queries sorted by
>>> a numeric field can leverage point index structures to only look at a
>>> small subset of the matching docs. Yet caching a query requires consuming
>>> all its matches, so it could significantly hurt latencies. It's important
>>> not to cache all queries in order to preserve the benefit of Lucene's
>>> filtering and dynamic pruning capabilities.
>>> - A corollary of the above is that the query cache is essentially a way
>>> to trade latency for throughput. Use-cases that care more about latency
>>> than throughput may want to disable the cache entirely (see the sketch
>>> after these bullets).
>>> - LRUQueryCache takes a `skipCacheFactor` which aims at limiting the
>>> impact of query caching on latency by not caching clauses whose cost is
>>> much higher than that of the overall query. It only helps for filters
>>> within conjunctions though, not in the dynamic pruning case when we don't
>>> know how many matches are going to be consumed.
>>> - Why are small segments never cached? Small segments are likely going to
>>> be merged soon, so it would be wasteful to build cache entries that would
>>> get evicted shortly afterwards.
>>> - The queries that never get cached don't get cached because a cached
>>> entry wouldn't perform faster than its uncached counterpart. An inverted
>>> index is already a cache of the matches for every unique term of the
>>> index.
>>>
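>>> For instance, here is a minimal sketch of both knobs. The sizes and
>>> thresholds are illustrative only, not recommendations:
>>>
>>> import java.util.function.Predicate;
>>> import org.apache.lucene.index.IndexReader;
>>> import org.apache.lucene.index.LeafReaderContext;
>>> import org.apache.lucene.search.IndexSearcher;
>>> import org.apache.lucene.search.LRUQueryCache;
>>>
>>> class CacheTuning {
>>>   // Latency-sensitive use-case: opt out of query caching entirely.
>>>   static IndexSearcher noCacheSearcher(IndexReader reader) {
>>>     IndexSearcher searcher = new IndexSearcher(reader);
>>>     searcher.setQueryCache(null);
>>>     return searcher;
>>>   }
>>>
>>>   // Keep caching, but bound its latency impact: only consider largish
>>>   // segments, and skip caching clauses whose cost exceeds 250x the cost
>>>   // of the query being run (the skipCacheFactor).
>>>   static IndexSearcher tunedCacheSearcher(IndexReader reader) {
>>>     Predicate<LeafReaderContext> leavesToCache =
>>>         ctx -> ctx.reader().maxDoc() >= 10_000;
>>>     IndexSearcher searcher = new IndexSearcher(reader);
>>>     searcher.setQueryCache(
>>>         new LRUQueryCache(1_000, 64 * 1024 * 1024, leavesToCache, 250f));
>>>     return searcher;
>>>   }
>>> }
>>>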
>>> On Fri, Jul 8, 2022 at 3:20 PM Shradha Shankar <shrdsha2612@gmail.com>
>>> wrote:
>>>
>>>> Hello!
>>>>
>>>> I work at Amazon Product Search and I’ve recently been looking into
>>>> understanding how Lucene’s LRU Query Cache works. I’ve written up a
>> summary
>>>> of my understanding here. (Also attached as a markdown file with this
>> email)
>>>>
>>>> Will really appreciate feedback/improvements/corrections for my
>>>> understanding and if this is worthy of contributing to the documentation
>>>> for LRU QueryCache. :)
>>>>
>>>> =================================
>>>> A brief overview of Lucene's Query Cache
>>>>
>>>> We first get acquainted with Lucene’s caching at IndexSearcher’s
>>>> createWeight method which is called for every query (and consequently
>>>> sub-queries within that query, eg see BooleanWeight) before we can
>> actually
>>>> find matching documents in our index and score them. Weight is really
>>>> another representation of a query that is specific to the statistics of
>> the
>>>> IndexSearcher being used. This definition makes it easier to see why
>>>> caching logic would start while creating weight for the query - we want
>> to
>>>> make a weight that will be responsible for caching matching docs per
>>>> segment. Since segments are specific to the IndexReader being used by
>> the
>>>> IndexSearcher, they are transitively, specific to the IndexSearcher.
>>>>
>>>> QueryCache in Lucene is an interface that has the signature for just one
>>>> method - doCache. doCache takes in a Weight (weight of the query in
>>>> question eg: TermWeight for a TermQuery) and operates on it based on the
>>>> rules defined by QueryCachingPolicy (yet another interface that defines
>> two
>>>> methods - onUse and shouldCache) to return a Weight. This “new” returned
>>>> Weight possibly wraps the original weight and bestows upon it some
>> caching
>>>> abilities.
>>>>
>>>> As of now, Lucene has one core implementation of the QueryCache and the
>>>> QueryCachingPolicy. All IndexSearcher instances have a default query
>> cache
>>>> - LRUQueryCache and use the default policy -
>>>> UsageTrackingQueryCachingPolicy.In the IndexSearcher’s createWeight
>> method,
>>>> we first create a weight for the incoming query and then subject it to
>> the
>>>> LRUQueryCache’s doCache method. An important thing to note here is that
>> we
>>>> only call doCache when the score mode passed to the search method does
>> not
>>>> need scores. Calling doCache does nothing complex; it just returns the
>>>> input weight wrapped as a CachingWrapperWeight that encapsulates the
>>>> caching policy information. No real caching has happened, yet!
>>>>
>>>> After getting a weight from the createWeight method, the IndexSearcher
>>>> iterates over each leaf and uses the weight to create a BulkScorer. A
>>>> BulkScorer, as the name suggests, is used to score a range of the
>> documents
>>>> - generally the range being all the matches found in that leaf. Given
>>>> context information for a leaf, every weight should know how to create a
>>>> bulk scorer for that leaf. In our case, the CachingWrapperWeight’s
>>>> BulkScorer method does a little bit extra and this is where the actual
>>>> caching happens!
>>>>
>>>> A brief dive into the query caching policy: While LRU says what we want
>> to
>>>> evict from a full cache, using query caching policies we can define
>> other
>>>> rules to use in conjunction with the cache’s design policy. The default
>>>> UsageTrackingQueryCachingPolicy dictates what queries will be put into
>> the
>>>> cache. This policy uses a ring buffer data structure optimised to track
>> and
>>>> retrieve frequencies for a given query. The policy also defines some
>>>> queries that will not be cached (TermQuery, MatchAllDocsQuery,
>>>> MatchNoDocsQuery to name a few). The OnUse method on this policy checks
>> if
>>>> the incoming query is something it would like to cache and if so, it
>>>> registers the frequency for this query. The
>> UsageTrackingQueryCachingPolicy
>>>> defines minimum frequencies for queries - 2 for costly queries and 5 for
>>>> others. When a query that can be cached and has been seen more than
>> minimum
>>>> number of times, the shouldCache method of the policy gives it a green
>>>> light for caching.
>>>>
>>>> A brief dive into LRUQueryCache: There are many attributes in this class
>>>> that track RAM usage and the maximum memory allowed for the cache, but
>>>> for easy understanding we want to highlight two attributes -
>>>> Map<IndexReader.CacheKey, LeafCache> cache and Map<Query, Query>
>>>> uniqueQueries.
>>>> uniqueQueries: A LinkedHashMap with accessOrder set to true and a bound
>>>> on its size behaves like an LRU cache, and uniqueQueries is just that.
>>>> While the keys in uniqueQueries are Query instances, the values are also
>>>> Query instances, not the DocIdSet of matches as we might expect. This LRU
>>>> map is shared across all segments; its purpose is to store a single
>>>> canonical instance of each recently seen query so that we don’t store
>>>> multiple copies of the same query.
>>>> cache: This map associates each leaf in the index, represented by its
>>>> CacheKey, with a LeafCache instance. The LeafCache isn’t itself a cache
>>>> but a Map<Query, CacheAndCount>, where Query is the canonical Query
>>>> instance from uniqueQueries and CacheAndCount is the actual data
>>>> structure that stores the DocIdSet of matches found for the query in that
>>>> leaf.
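>>>>
>>>> As a toy illustration of that access-ordered LinkedHashMap idiom (this is
>>>> not Lucene code - LRUQueryCache implements its own eviction, keyed on
>>>> entry count and RAM usage):
>>>>
>>>> import java.util.LinkedHashMap;
>>>> import java.util.Map;
>>>>
>>>> // An access-ordered LinkedHashMap that evicts its least-recently-used
>>>> // entry once it grows past maxSize entries.
>>>> class LruMap<K, V> extends LinkedHashMap<K, V> {
>>>>   private final int maxSize;
>>>>
>>>>   LruMap(int maxSize) {
>>>>     super(16, 0.75f, true); // accessOrder = true: get() refreshes recency
>>>>     this.maxSize = maxSize;
>>>>   }
>>>>
>>>>   @Override
>>>>   protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
>>>>     return size() > maxSize;
>>>>   }
>>>> }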
>>>>
>>>> Returning to the creation of the BulkScorer from the
>>>> CachingWrapperWeight, we first register the query (and hence its
>>>> frequency) on the policy via the onUse method. At present, Lucene only
>>>> wants to cache queries on leaves that hold more than 10k documents and
>>>> make up more than 3% of the total number of documents in the index. Once
>>>> we determine that the segment is eligible for caching, we use a
>>>> CacheHelper to get the CacheKey for the segment so that we can access the
>>>> LeafCache corresponding to it. We check whether the query has a
>>>> CacheAndCount entry in that LeafCache and return it (a cache hit!) or
>>>> null (a cache miss). In the case of a cache miss, we use the policy’s
>>>> shouldCache to determine if this query is cache-eligible and, if it is,
>>>> we create a CacheAndCount instance for the query-leaf pair. We add the
>>>> Query to uniqueQueries, and put the query along with its CacheAndCount
>>>> instance into the LeafCache for that leaf. We’ve finally cached a query
>>>> and its match set!
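>>>>
>>>> To watch the miss-then-hit flow end to end, here is a small
>>>> self-contained sketch that uses only public API. It substitutes a
>>>> cache-every-leaf predicate and an always-cache policy so that a tiny demo
>>>> index qualifies (under the default policy and the segment thresholds
>>>> above, nothing here would be cached):
>>>>
>>>> import org.apache.lucene.document.Document;
>>>> import org.apache.lucene.document.Field;
>>>> import org.apache.lucene.document.StringField;
>>>> import org.apache.lucene.index.DirectoryReader;
>>>> import org.apache.lucene.index.IndexWriter;
>>>> import org.apache.lucene.index.IndexWriterConfig;
>>>> import org.apache.lucene.index.Term;
>>>> import org.apache.lucene.search.BooleanClause;
>>>> import org.apache.lucene.search.BooleanQuery;
>>>> import org.apache.lucene.search.IndexSearcher;
>>>> import org.apache.lucene.search.LRUQueryCache;
>>>> import org.apache.lucene.search.Query;
>>>> import org.apache.lucene.search.QueryCachingPolicy;
>>>> import org.apache.lucene.search.TermQuery;
>>>> import org.apache.lucene.store.ByteBuffersDirectory;
>>>> import org.apache.lucene.store.Directory;
>>>>
>>>> public class QueryCacheDemo {
>>>>   public static void main(String[] args) throws Exception {
>>>>     Directory dir = new ByteBuffersDirectory();
>>>>     try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
>>>>       for (int i = 0; i < 100; i++) {
>>>>         Document doc = new Document();
>>>>         doc.add(new StringField("color", i % 2 == 0 ? "red" : "blue", Field.Store.NO));
>>>>         writer.addDocument(doc);
>>>>       }
>>>>     }
>>>>     try (DirectoryReader reader = DirectoryReader.open(dir)) {
>>>>       IndexSearcher searcher = new IndexSearcher(reader);
>>>>       // Tiny limits, cache-every-leaf predicate, default-ish skipCacheFactor.
>>>>       LRUQueryCache cache = new LRUQueryCache(100, 10 * 1024 * 1024, ctx -> true, 250f);
>>>>       searcher.setQueryCache(cache);
>>>>       searcher.setQueryCachingPolicy(new QueryCachingPolicy() {
>>>>         @Override public void onUse(Query query) {}
>>>>         @Override public boolean shouldCache(Query query) { return true; }
>>>>       });
>>>>       Query filter = new BooleanQuery.Builder()
>>>>           .add(new TermQuery(new Term("color", "red")), BooleanClause.Occur.MUST)
>>>>           .add(new TermQuery(new Term("color", "blue")), BooleanClause.Occur.MUST_NOT)
>>>>           .build();
>>>>       searcher.count(filter); // first run: cache misses, entries get built
>>>>       searcher.count(filter); // second run: served from cached DocIdSets
>>>>       System.out.println("hits=" + cache.getHitCount() + ", misses=" + cache.getMissCount());
>>>>     }
>>>>   }
>>>> }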
>>>>
>>>> When creating a CacheAndCount instance for a query that is being newly
>>>> cached, we want to find the matches for that query. We use the actual
>>>> weight for that query (the one wrapped inside the CachingWrapperWeight)
>>>> to create a BulkScorer and then call score on that scorer to populate a
>>>> LeafCollector with the matches for the query in this leaf. Depending on
>>>> the density of the result set, we back the DocIdSet with either a
>>>> FixedBitSet (dense) or a RoaringDocIdSet (sparse).
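>>>>
>>>> A sketch of that density-based choice (the 1% cutoff below is an
>>>> assumption for illustration; the real heuristic lives inside
>>>> LRUQueryCache):
>>>>
>>>> import java.io.IOException;
>>>> import org.apache.lucene.search.BulkScorer;
>>>> import org.apache.lucene.search.DocIdSet;
>>>> import org.apache.lucene.search.LeafCollector;
>>>> import org.apache.lucene.search.Scorable;
>>>> import org.apache.lucene.util.BitDocIdSet;
>>>> import org.apache.lucene.util.FixedBitSet;
>>>> import org.apache.lucene.util.RoaringDocIdSet;
>>>>
>>>> class DocIdSetDensity {
>>>>   // Collect all matches of `scorer` into a DocIdSet, choosing the
>>>>   // representation by the expected density of matches in the segment.
>>>>   static DocIdSet collectMatches(BulkScorer scorer, int maxDoc) throws IOException {
>>>>     if (scorer.cost() * 100 >= maxDoc) { // dense (roughly 1%+): one bit per doc
>>>>       FixedBitSet bits = new FixedBitSet(maxDoc);
>>>>       scorer.score(new LeafCollector() {
>>>>         @Override public void setScorer(Scorable s) {}
>>>>         @Override public void collect(int doc) { bits.set(doc); }
>>>>       }, null);
>>>>       return new BitDocIdSet(bits);
>>>>     } else {                             // sparse: compressed roaring layout
>>>>       RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(maxDoc);
>>>>       scorer.score(new LeafCollector() {
>>>>         @Override public void setScorer(Scorable s) {}
>>>>         @Override public void collect(int doc) { builder.add(doc); }
>>>>       }, null);
>>>>       return builder.build();
>>>>     }
>>>>   }
>>>> }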
>>>>
>>>> Finally, we create a DocIdSetIterator from the retrieved or newly
>>>> created CacheAndCount instance and return the most commonly used
>>>> BulkScorer - the DefaultBulkScorer - with a ConstantScoreScorer (score
>>>> set to 0) over the DocIdSetIterator. When there is a cache miss and the
>>>> query is ineligible for caching, we return the BulkScorer from the
>>>> actual weight wrapped inside the CachingWrapperWeight, as we would if no
>>>> caching logic were present!
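>>>>
>>>> On a hit, the wrapper hands back something like the following (a
>>>> paraphrase rather than the exact Lucene code; `weight` and `cached` are
>>>> stand-ins for the wrapping weight and the cached entry, and the
>>>> constructor signatures follow roughly the 9.x API):
>>>>
>>>> DocIdSetIterator disi = cached.iterator(); // iterate the cached matches
>>>> Scorer scorer =
>>>>     new ConstantScoreScorer(weight, 0f, ScoreMode.COMPLETE_NO_SCORES, disi);
>>>> return new Weight.DefaultBulkScorer(scorer);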
>>>>
>>>> Thanks,
>>>> Shradha
>>>>
>>>
>>>
>>>
>>> --
>>> Adrien
>>
>>
>
> --
> Adrien