Mailing List Archive

Question about searcherManager applyAllDeletes parameter and maybeRefresh method
Hi!

Hope you are all doing well! We are using SearcherManager
(https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/SearcherManager.html)
and are a little confused about the applyAllDeletes parameter in the
constructor and the maybeRefresh() method. In our case, we need to search
over refreshed docs before we call writer.commit(), so we are thinking of
constructing the SearcherManager with applyAllDeletes set to true. However,
we found another method, ReferenceManager.maybeRefresh()
<https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/ReferenceManager.html#maybeRefresh-->,
which refreshes the managed instance.

My question is: if we periodically call maybeRefresh() and make sure the
manager is refreshed every time before we acquire a new searcher, do we
still need to set applyAllDeletes to true when constructing the manager?

For example, I wrote a simple test like:

> val manager = new SearcherManager(writer, false /*applyAllDeletes*/,
>   false /*writeAllDeletes*/, null /*searcherFactory*/)
> writer.deleteDocuments(term) // delete some docs
>
> manager.maybeRefresh()
> val searcher = manager.acquire()
> val hits = searcher.search(...)
> hits.totalHits shouldBe 0

and the test would pass (the deletes are applied).


Sincerely,
Ningshan
Re: Question about searcherManager applyAllDeletes parameter and maybeRefresh method
Hi Ningshan,
If you want to make sure the deletes are applied after you call
maybeRefresh(), then you need to set applyAllDeletes to true.

A bit more detail:
The SearcherManager constructor passes applyAllDeletes to the IndexWriter,
which in turn passes it to the StandardDirectoryReader here
<https://github.com/apache/lucene-solr/blob/branch_7_4/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java#L524>.
Then, whenever you call maybeRefresh(), the reader tries to refresh itself
using the parameters it was initialized with (src
<https://github.com/apache/lucene-solr/blob/branch_7_4/lucene/core/src/java/org/apache/lucene/index/StandardDirectoryReader.java#L288>).
So basically, the applyAllDeletes value you pass to SearcherManager affects
every call to maybeRefresh().
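
To make that flow concrete, here is a rough sketch at the DirectoryReader
level, assuming Lucene 7.4 on the classpath. The object and helper names
(NrtReaderSketch, openNrt, refreshed) are made up for illustration; this is
only an approximation of what SearcherManager does on your behalf, not its
actual implementation.

```scala
import org.apache.lucene.index.{DirectoryReader, IndexWriter}

// Hypothetical helper object, for illustration only.
object NrtReaderSketch {
  // The flag is supplied once, when the near-real-time reader is first opened.
  def openNrt(writer: IndexWriter, applyAllDeletes: Boolean): DirectoryReader =
    DirectoryReader.open(writer, applyAllDeletes, false /*writeAllDeletes*/)

  // A refresh takes no flag of its own: the reader reopens itself with the
  // settings it was created with. openIfChanged returns null when nothing
  // changed, in which case the existing reader is handed back.
  // When a new instance comes back, the caller is responsible for closing
  // the old reader.
  def refreshed(reader: DirectoryReader): DirectoryReader =
    Option(DirectoryReader.openIfChanged(reader)).getOrElse(reader)
}
```

Note that there is no place to pass applyAllDeletes at refresh time, which
is why the constructor argument is the only knob.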

Best
Patrick

On Thu, Mar 2, 2023 at 3:03 PM Ningshan Li <ningshan.li@rubrik.com> wrote:

Re: Question about searcherManager applyAllDeletes parameter and maybeRefresh method
Hi Patrick,

Thanks for the quick response; the explanation and sources are helpful!
But there is still a point we couldn't quite understand: why did the test I
mentioned earlier pass (applyAllDeletes set to false, followed by
maybeRefresh())? If the delete is not applied, we should see the deleted doc
in the search result, right?
I even ran it more than 10k times, but I never hit a case where the search
result contained the deleted doc (meaning the delete had not been applied).


Sincerely,
Ningshan

On Thu, Mar 2, 2023 at 3:30 PM Patrick Zhai <zhai7631@gmail.com> wrote:

Re: Question about searcherManager applyAllDeletes parameter and maybeRefresh method
Note that the javadoc says
"If false, the deletes may or may not be applied",
which means it will not force all deletes to be applied; it is up to
IndexWriter to decide whether to apply them at refresh time.
I'm not 100% sure how IndexWriter decides that, and maybe someone who knows
more can chime in, but since your unit test deletes just one doc, it's quite
possible that IndexWriter applies the delete right away
regardless of what you passed in.
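
If you need the guarantee rather than the "may or may not" behavior, a
minimal sketch of the setup could look like the following. It assumes
Lucene 7.4 on the classpath; the object name, the "id" field, and the
in-memory directory are made up for illustration and are not taken from
your test.

```scala
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, StringField}
import org.apache.lucene.index.{IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.search.{SearcherManager, TermQuery}
import org.apache.lucene.store.RAMDirectory

// Hypothetical example object, for illustration only.
object ApplyAllDeletesSketch {
  // Number of hits for the deleted term as seen by a near-real-time
  // searcher, after the delete but before any commit.
  def hitsAfterDelete(): Int = {
    val dir = new RAMDirectory()
    val writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))
    val doc = new Document()
    doc.add(new StringField("id", "1", Field.Store.NO))
    writer.addDocument(doc)

    // applyAllDeletes = true: every refresh must make buffered deletes visible.
    val manager = new SearcherManager(writer, true /*applyAllDeletes*/,
      false /*writeAllDeletes*/, null /*searcherFactory*/)
    try {
      val term = new Term("id", "1")
      writer.deleteDocuments(term)

      // The blocking variant waits if another thread is already refreshing,
      // instead of returning without having refreshed.
      manager.maybeRefreshBlocking()

      val searcher = manager.acquire()
      try searcher.count(new TermQuery(term))
      finally manager.release(searcher) // always pair acquire() with release()
    } finally {
      manager.close()
      writer.close()
      dir.close()
    }
  }
}
```

With applyAllDeletes set to true, the result no longer depends on whatever
IndexWriter happens to decide internally at refresh time.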

Hope that helps

Patrick

On Thu, Mar 2, 2023 at 3:50 PM Ningshan Li <ningshan.li@rubrik.com> wrote:
