Tlogs are essential for peer sync. In SolrCloud, a successful local
commit doesn’t mean the tlog can be safely removed, because some
other replica may need the docs replayed from this replica’s tlog
so that it doesn’t have to sync the entire index.
They’re also essential for replaying updates if somebody, say,
did a kill -9 before a commit happened.
That said…
I question whether the current peer sync is all that useful.
The default is to keep 100 docs for peer sync, and under any
kind of significant indexing load, by the time replica1 is asked
to peer sync, chances are that it will have long ago flushed past
100 docs, and recovery will fall back to a full sync.
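To put rough numbers on that, here is a back-of-the-envelope sketch. It is a
toy model, not Solr's PeerSync logic: the function name and the simple
"missed updates vs. window" rule are assumptions for illustration, and the
only figure taken from above is the default window of 100 docs.

```python
# Toy model only; real peer sync compares update versions and is more involved.
NUM_RECORDS_TO_KEEP = 100  # default window of recent docs kept for peer sync


def recovery_mode(docs_per_second, seconds_behind, window=NUM_RECORDS_TO_KEEP):
    """Guess which recovery path a lagging replica would take.

    If the number of updates it missed still fits in the window of recent
    docs, peer sync can cover the gap; otherwise a full sync is needed.
    """
    missed = docs_per_second * seconds_behind
    return "peer_sync" if missed <= window else "full_sync"


if __name__ == "__main__":
    # Light load: 10 docs/s, replica 5 seconds behind, 50 docs missed.
    print(recovery_mode(10, 5))
    # Heavier load: 500 docs/s, replica 2 seconds behind, 1000 docs missed.
    print(recovery_mode(500, 2))
```

At 500 docs/s the window is used up in a fifth of a second, which is the
concern with the default.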
On the other hand, we’ve talked about a read-only index. WDYT about some
API call like “ipromiseyouweredoneindexing” that would do something like:
1> commit with an fsync
2> purge all tlogs
Since the code can’t know that you’re done indexing, you’d need some kind
of external reassurance...
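A minimal sketch of those two steps, to make the ordering concrete. Nothing
here is an existing Solr API: the function only mimics the idea on plain
files, the "index" is a single stand-in file, and the assumption that tlog
files are named tlog.* is mine. Don't run anything like this against a live
Solr data directory.

```python
import os
import tempfile


def ipromiseyouweredoneindexing(index_file, tlog_dir):
    """Hypothetical: 1> fsync the committed data, 2> purge all tlogs.

    Returns the names of the tlog files that were removed.
    """
    # Step 1: force the committed data to stable storage.
    fd = os.open(index_file, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: only after the fsync has succeeded, remove every tlog file.
    removed = []
    for name in sorted(os.listdir(tlog_dir)):
        if name.startswith("tlog."):  # assumed naming scheme
            os.remove(os.path.join(tlog_dir, name))
            removed.append(name)
    return removed


def demo():
    """Build a throwaway directory with one 'index' file and two tlogs."""
    with tempfile.TemporaryDirectory() as root:
        index_file = os.path.join(root, "index.bin")
        with open(index_file, "wb") as f:
            f.write(b"committed segment data")
        for n in (1, 2):
            with open(os.path.join(root, f"tlog.{n:019d}"), "wb") as f:
                f.write(b"pending updates")
        removed = ipromiseyouweredoneindexing(index_file, root)
        return removed, sorted(os.listdir(root))
```

The point of the ordering is that the tlogs go away only once the commit is
known to be durable. The caller is the one promising that no more updates
are coming.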
> On Nov 6, 2020, at 10:11 AM, Dawid Weiss <dawid.weiss@gmail.com> wrote:
>
> Thanks David. When you index lots of data that pending tlog can be
> megabytes large... if it's a one-off (no more
> documents will ever be indexed) then this looks strange like hell and
> takes up VM disk.
>
> Dawid
>
> On Fri, Nov 6, 2020 at 3:49 PM David Smiley <dsmiley@apache.org> wrote:
>>
>> AFAIK this is normal. They will rotate, however. Send more documents with a commit=true, and the oldest tlog will go away. I think there's always one tlog around, even when everything is committed. It ought to be improved but it's not a big problem.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Mon, Aug 17, 2020 at 3:37 AM Dawid Weiss <dawid.weiss@gmail.com> wrote:
>>>
>>> Hi Erick,
>>>
>>>> Does it rotate? I.e. is there a new one after every commit?
>>>
>>> The "last" one after bulk-import of documents doesn't. Any commit
>>> command seems to be ignored.
>>>
>>>> If you have steps to repro I can take a look.
>>>
>>> It is vanilla distribution Solr. I'll see if I can provide a repro if
>>> I can't find out what's causing it. Thanks!
>>>
>>> D.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>