Mailing List Archive

Regarding field cache
Hi,

I would like to know if there is any automatic eviction policy for the
field cache entries. I understand that it gets invalidated when a new
searcher opens. But, my question is in case if gc runs or if there is any
other scenario which could evict the unused entries from fieldcache.

Please help to clarify the same.

Thanks
Poorna
Re: Regarding field cache
Hi,

They get evicted when the segment of that index is closed. The entries
are held in a WeakHashMap<LeafReader,Cache>, so after that there is no
reference to them anymore and the cache object gets freed by GC. This
happens on refresh of the searcher, where unused segments are closed and
new ones are opened. There is no way to get rid of entries on a live
searcher.
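
The weak-key behaviour described above can be sketched with plain JDK
classes. This is only an illustration, not Lucene or Solr code: SegmentKey
is a hypothetical stand-in for a per-segment reader, and the long[] value
stands in for cached field data. Whether the second count drops depends on
the garbage collector, since System.gc() is only a hint.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheSketch {
    // Hypothetical stand-in for a per-segment reader; not a Lucene class.
    static final class SegmentKey {
        final String name;
        SegmentKey(String name) { this.name = name; }
    }

    // Keys are held weakly: an entry can only be collected once nothing
    // else holds a strong reference to its key.
    static final Map<SegmentKey, long[]> CACHE = new WeakHashMap<>();

    static int cachedSegments() {
        return CACHE.size();
    }

    public static void main(String[] args) throws InterruptedException {
        SegmentKey live = new SegmentKey("_0");
        SegmentKey closed = new SegmentKey("_1");
        CACHE.put(live, new long[1024]);
        CACHE.put(closed, new long[1024]);
        System.out.println("entries while both are referenced: " + cachedSegments());

        closed = null;      // simulate closing the segment: last strong reference gone
        System.gc();        // a hint only; collection is not guaranteed
        Thread.sleep(100);
        System.out.println("entries after dropping one key (GC-dependent): " + cachedSegments());

        // The entry for the still-referenced key cannot be evicted.
        System.out.println("live segment still cached: " + CACHE.containsKey(live));
    }
}
```

The point of the sketch is the last line: as long as the key is strongly
referenced (a live searcher holding its segments), GC alone never removes
the entry.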

FieldCache has not been available since Lucene 6, so which version are
you using? Since Lucene 4 it is better to use DocValues fields for
sorting or faceting/aggregations.

If you are using Solr, there's still a clone of FieldCache as part of
Solr's codebase (it is not supported by Lucene anymore), but that's only
for legacy indexes where the schema was not updated to use DocValues. In
an "ideally configured Solr server", the Admin UI shows no entries below
the core's FieldCache stats. If you see entries there, go and change the
config of those fields by adding docValues=true.

Uwe

On 08.06.2022 at 15:26, Poorna Murali wrote:
> [...]
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
RE: Re: Regarding field cache
Thanks Uwe! A new searcher opens when we do a commit. Apart from this, are
there other scenarios where a searcher would be refreshed?

On 2022/06/08 16:43:07 Uwe Schindler wrote:
> [...]
Re: Regarding field cache
Hi,

You do not necessarily need a commit. If you use SearcherManager in
combination with NRTCachingDirectory, you can also refresh your searcher
every few seconds, so that in-memory cached segments are searched. But in
short: if you do not explicitly ask for a fresh searcher, there won't be
any automatic refreshes and the caches stay as they are.

It is also important for older Lucene versions that still support
FieldCache: make sure your queries work "per segment" and not globally.
Wrongly written applications end up with one cache entry per index
instead of one per segment, so on every refresh the whole cache has to be
rebuilt. This is also one reason why DocValues are preferred for sorting.
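
The cost difference can be sketched with a toy model in plain Java. This
is not Lucene code: segment names are hypothetical strings, the int[]
value is a placeholder for cached field data, and both methods only count
how many segments have to be re-read on a refresh.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class SegmentCacheSketch {
    // Per-segment caching: entries of segments that survive the refresh
    // are reused, only new segments are built. Returns the number built.
    static int refreshPerSegment(Map<String, int[]> cache, List<String> segments) {
        // Drop entries that belong to segments no longer in the index.
        cache.keySet().retainAll(new HashSet<>(segments));
        int built = 0;
        for (String seg : segments) {
            if (!cache.containsKey(seg)) {
                cache.put(seg, new int[0]); // placeholder for cached field values
                built++;
            }
        }
        return built;
    }

    // Per-index caching: the single entry belongs to the old top-level
    // reader, so a refresh has to re-read every segment.
    static int refreshPerIndex(List<String> segments) {
        return segments.size();
    }

    public static void main(String[] args) {
        Map<String, int[]> cache = new HashMap<>();
        List<String> before = Arrays.asList("_0", "_1", "_2");
        List<String> after = Arrays.asList("_0", "_1", "_3"); // _2 closed, _3 new

        refreshPerSegment(cache, before); // initial warm-up builds all three
        System.out.println("per-segment rebuilds: " + refreshPerSegment(cache, after)); // 1
        System.out.println("per-index rebuilds: " + refreshPerIndex(after));            // 3
    }
}
```

In this model a refresh that replaces one of three segments rebuilds one
entry with per-segment caching and all three with a per-index entry.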

https://2012.berlinbuzzwords.de/sessions/your-index-reader-really-atomic-or-maybe-slow.html

https://www.youtube.com/watch?v=iZZ1AbJ6dik

Uwe

On 08.06.2022 at 19:05, Poorna Murali wrote:
> [...]
RE: Re: Regarding field cache
Thanks Uwe for the details. In our Solr (8.4) configuration, we have a
fieldCache that holds the fields used for sorting. We can observe that
the fieldCache is getting cleared sometimes. But I do not think we have
the SearcherManager logic mentioned below implemented in our setup. We
have not modified any Solr/Lucene implementation.

So, without opening or refreshing a searcher, I am not able to understand
how the field cache is getting cleared.
Can you please help to clarify this?

On 2022/06/08 17:46:50 Uwe Schindler wrote:
> [...]
Re: Regarding field cache
Hi,

As mentioned before, since Lucene 6 there's no FieldCache in Lucene
anymore, so this is the wrong mailing list to ask those questions. Apache
Solr has its own implementation (a deprecated legacy copy of the old
Lucene implementation). Solr uses a kind of SearcherManager, so the field
cache gets updated after every soft commit (which is what a refresh in
SearcherManager does). After a refresh, entries may disappear and
new ones may appear. Also, when you reload cores, all entries disappear.

As said before, IMPORTANT: FieldCache entries inside the Solr Admin UI
cache statistics are a sign of a bad index configuration! Watch the
video from 2012 where this was discussed for the first time. To fix
this, make a list of all field names appearing there and change those
fields in your schema to docValues=true (+ reindex or force-merge after
the change). After that the FieldCache stats will be empty. FieldCache
is a legacy mechanism and is no longer supported, so you should really
get rid of it.

Uwe

On 08.06.2022 at 20:18, Poorna Murali wrote:
> [...]