Re: [jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud
I made some comments on the PR.

> On Jul 24, 2020, at 4:03 PM, David Eric Pugh (Jira) <jira@apache.org> wrote:
>
>
> [ https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164612#comment-17164612 ]
>
> David Eric Pugh commented on SOLR-14581:
> ----------------------------------------
>
> [~bvd] [~erickerickson] [~dsmiley] I've pushed up a PR for this. I'll give it until Monday and then merge: https://github.com/apache/lucene-solr/pull/1692
>
> I went with David's specific wording.
>
>> Document the way auto commits work in SolrCloud
>> -----------------------------------------------
>>
>> Key: SOLR-14581
>> URL: https://issues.apache.org/jira/browse/SOLR-14581
>> Project: Solr
>> Issue Type: Bug
>> Components: documentation, SolrCloud
>> Affects Versions: master (9.0)
>> Reporter: Bram Van Dam
>> Assignee: David Eric Pugh
>> Priority: Minor
>> Attachments: SOLR-14581.patch
>>
>> Time Spent: 10m
>> Remaining Estimate: 0h
>>
>> The documentation is unclear about how auto commits actually work in SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening.
>> Erick's reply verbatim:
>> {quote}Each node has its own timer that starts when it receives an update.
>> So in your situation, 60 seconds after any given replica gets its first
>> update, all documents that have been received in the interval will
>> be committed.
>> But note several things:
>> 1> commits will tend to cluster for a given shard. By that I mean
>> they’ll tend to happen within a few milliseconds of each other
>> ‘cause it doesn’t take that long for an update to get from the
>> leader to all the followers.
>> 2> this is per replica. So if you host replicas from multiple collections
>> on some node, their commits have no relation to each other. And
>> say for some reason you transmit exactly one document that lands
>> on shard1. Further, say nodeA contains replicas for shard1 and shard2.
>> Only the replica for shard1 would commit.
>> 3> Solr promises eventual consistency. In this case, due to all the
>> timing variables it is not guaranteed that every replica of a single
>> shard has the same document available for search at any given time.
>> Say doc1 hits the leader at time T and a follower at time T+10ms.
>> Say doc2 hits the leader and gets indexed 5ms before the
>> commit is triggered, but for some reason it takes 15ms for it to get
>> to the follower. The leader will be able to search doc2, but the
>> follower won’t until 60 seconds later.{quote}
>> Perhaps the subject deserves a section of its own, but I'll attach a patch which includes the gist of Erick's reply as a Tip in the "indexing in SolrCloud"-section.
>
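
For anyone following along, the timing in Erick's reply can be sketched in a few lines of Python. This is a toy model, not Solr code: the `Replica` class, its method names, and the millisecond bookkeeping are all hypothetical. It assumes a 60-second auto commit interval, and it assumes that an update arriving at the exact instant a commit is due misses that commit. Under those assumptions, the numbers from point 3 (doc1 at T and T+10ms, doc2 arriving 5ms before the leader's commit and taking 15ms to reach the follower) play out like this:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

AUTO_COMMIT_MS = 60_000  # assumed interval: the 60 seconds from the example


@dataclass
class Replica:
    """Toy model of one replica's auto commit timer (hypothetical, not Solr code)."""
    name: str
    pending: List[str] = field(default_factory=list)        # received, not yet committed
    searchable: Dict[str, int] = field(default_factory=dict)  # doc -> time it became visible
    commit_due: Optional[int] = None                        # when the running timer fires

    def advance(self, now: int) -> None:
        # Fire the timer if it is due at or before `now`.
        if self.commit_due is not None and self.commit_due <= now:
            for doc in self.pending:
                self.searchable[doc] = self.commit_due
            self.pending.clear()
            self.commit_due = None

    def receive(self, doc: str, now: int) -> None:
        # Assumption: a commit due at this exact instant fires before the update lands.
        self.advance(now)
        self.pending.append(doc)
        if self.commit_due is None:
            # The timer starts when the replica receives its first uncommitted update.
            self.commit_due = now + AUTO_COMMIT_MS


T = 0
leader = Replica("shard1-leader")
follower = Replica("shard1-follower")
idle = Replica("shard2-replica")  # same node, but never receives an update

leader.receive("doc1", T)                   # leader timer fires at T+60000
follower.receive("doc1", T + 10)            # follower timer fires at T+60010
leader.receive("doc2", T + 59_995)          # 5ms before the leader's commit
follower.receive("doc2", T + 59_995 + 15)   # reaches the follower 15ms later

for replica in (leader, follower, idle):
    replica.advance(T + 200_000)            # let all running timers fire
    print(replica.name, replica.searchable)
```

In this model the leader makes doc2 searchable at T+60000ms, while the follower only does so at T+120010ms, roughly 60 seconds later. The shard2 replica never starts a timer and never commits, which matches point 2.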

