Mailing List Archive

[DISCUSS] SIP-12: Incremental Backup and Restore
Hey all,

This morning I published SIP-12, which proposes an overhaul of Solr's
backup and restore functionality. While the "headline" improvement in
this SIP is a change to do backups incrementally, it bundles in a
number of other improvements as well, including the addition of
corruption checks, APIs to list and delete backups, and stronger
integration points with popular object storage APIs.

The SIP can be found here:
https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore

Please read the SIP description and come back here for discussion. As
the discussion progresses we will update the SIP page with any
outcomes and eventually move things to a VOTE.

Looking forward to hearing your feedback.

Best,

Jason

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
Much needed! Thanks for initiating this Jason!

As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, implement the new improved version as a v2 API only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head start on converting the Collections API to v2 style.
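
For context, here is a minimal Python sketch of the v1 style referred to above, where a state-changing backup is requested with an HTTP GET and every argument travels in the query string. The host, port, and the collection, backup name, and location values are assumptions chosen for illustration. The code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Assumption: a Solr node on localhost:8983; adjust for a real deployment.
SOLR_HOST = "http://localhost:8983"


def v1_backup_url(collection, name, location):
    """Build the v1 Collections API URL that triggers a backup via GET."""
    params = {
        "action": "BACKUP",        # the operation is selected by a query parameter
        "collection": collection,  # collection to back up
        "name": name,              # name given to the backup
        "location": location,      # repository path to write to
    }
    return SOLR_HOST + "/solr/admin/collections?" + urlencode(params)


if __name__ == "__main__":
    # Illustrative values only.
    print(v1_backup_url("books", "nightly", "/tmp/backups"))
```

Because the whole request is a URL, anything that issues a GET (a browser, a crawler, a retrying proxy) can start the operation, which is part of the motivation for moving creation and deletion to a v2 style.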

Jan

> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
Hey Jan, thanks for the review.

I hadn't thought about the V2 API in connection to this work. You're
right though I think - the SIP proposes net-new APIs, so it should add
V2 equivalents at the very least. I'll draft tentative details for
these APIs on the SIP and we can refine things from there.

I'm more up in the air on your specific suggestion to restrict the SIP
changes to these v2 APIs. It is an elegant approach to the
backcompat, and it provides a carrot for v2 adoption - both of which I
like. But it would let users create snapshot-based backups (and keep
us maintaining that code) longer than there's any strict need to. And
users are left on the less-efficient format by default. (By contrast,
the current SIP has snapshot-backup creation being replaced by
incremental-backup creation as soon as the latter is available.) Did
you have a particular lifespan in mind for snapshot-based creation if
we go with this approach?

Jason

On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
> ... and implement the new improved version as a V2-api only, and then deprecate the v1 API?


V2 only please

On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:


--
-----------------------------------------------------
Noble Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
Hey guys,

Following up to make sure I understand the specifics you're
suggesting. You're proposing that:

1. The brand new backup-related APIs (list-backups and delete-backup)
be added in v2-form only.
2. Tweaks to existing backup-related APIs (create-backup, restore) be
made in v2-form only.
3. All existing v1 backup-related APIs be deprecated and left
unchanged. Incremental backups will not be possible using the v1 API.

I'm not against going this route if there's consensus around it. But
I'm not 100% clear on how it means we don't need to worry about
backcompat. Backup and Restore currently exist as both a v1 and a v2
API - I understand how leaving the v1 APIs untouched (other than
deprecation) frees us of some backcompat concerns there, but we would
still need to make tweaks to the v2 backup/restore APIs and would have
to tread just as carefully there in terms of backcompat, afaict.
Unless Solr's backcompatibility guarantees only cover the v1 API and
leave v2 changes to be made freely? I looked around to see if the v2
APIs had any sort of "experimental" designation, but couldn't find
that clearly stated anywhere. Am I missing something?

Best,

Jason

On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
Thanks for taking on this effort, Jason. I'll review and suggest more next
year. My initial impression is that any non-core functionality should
remain outside Solr core as much as possible. I hope we can leverage
modularity wherever possible.

On Tue, 22 Dec, 2020, 10:34 pm Jason Gerlowski <gerlowskija@gmail.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
Actually, I don't think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don't have a true v2 API spec for it, do we? Where is that documented?

Jan Høydahl

> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore
> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?

Tbh I'm not yet familiar enough with the v2 APIs to understand the
distinction you're making. (Do you have a pointer to something that'd
fill me in?)

To zoom in on "backup" as an example, the v2 API I'm referring to
looks like: /v2/collections -d '{ "backup-collection":
{"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
And it's included in the v2 "introspect" documentation returned by
this API: /v2/collections/_introspect?command=backup-collection. To
me that looked like a v2 API, but maybe path-aliases are also covered
in the introspect docs.
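
As a rough illustration of the request shape described above, the following Python sketch builds (but does not send) the v2 POST and the matching introspect URL. The paths, command name, and body values are the ones quoted in this message. The host and port are placeholders because the message omits them, and the Content-Type header is an assumption since the message does not show one.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the message does not give a host; localhost:8983 is a placeholder.
SOLR_HOST = "http://localhost:8983"


def v2_backup_request(collection, name, location):
    """Build the v2 POST whose JSON body names the command as its single key."""
    body = {
        "backup-collection": {
            "collection": collection,
            "name": name,
            "location": location,
        }
    }
    return Request(
        SOLR_HOST + "/v2/collections",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: JSON content type; not shown in the original message.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def v2_introspect_url(command):
    """Build the introspect URL that describes a single v2 command."""
    return SOLR_HOST + "/v2/collections/_introspect?" + urlencode({"command": command})


if __name__ == "__main__":
    req = v2_backup_request("books", "asdf3", "/tmp/foo")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    print(v2_introspect_url("backup-collection"))
```

Passing the returned Request to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a Solr node.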

Jason

On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
OK, that’s the one I was looking for. It’s not documented in the backup chapter of the ref guide :(

Jan Høydahl

> On 23 Dec 2020 at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>
>>
>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
>
> Tbh I'm not yet familiar enough with the v2 APIs to understand the
> distinction you're making. (Do you have a pointer to something that'd
> fill me in?)
>
> To zoom in on "backup" as an example, the v2 API I'm referring to
> looks like: /v2/collections" -d '{ "backup-collection":
> {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
> And it's included in the v2 "introspect" documentation returned by
> this API: /v2/collections/_introspect?command=backup-collection". To
> me that looked like a v2 API, but maybe path-aliases are also covered
> in the introspect docs.
>
> Jason
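
For readers following along, the v2 request described in the message above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not Solr's documented client API: the command name, the field names, and the /v2/collections path come from the quoted message, while the host and port, the Content-Type header, and the helper names build_backup_request and build_introspect_url are assumptions made for illustration. The request is only built, never sent.

```python
import json
import urllib.request

# Assumption: a local Solr node. The thread elides the host, so this base URL is hypothetical.
SOLR_BASE_URL = "http://localhost:8983"


def build_backup_request(collection, name, location, base_url=SOLR_BASE_URL):
    """Build (but do not send) the v2 backup-collection request quoted in the thread."""
    # The command name and field names are copied from the quoted message.
    payload = {
        "backup-collection": {
            "collection": collection,
            "name": name,
            "location": location,
        }
    }
    return urllib.request.Request(
        url=base_url + "/v2/collections",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the body is sent as JSON; the thread does not show any headers.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_introspect_url(command, base_url=SOLR_BASE_URL):
    """Return the introspect URL for one v2 command, as mentioned in the thread."""
    return base_url + "/v2/collections/_introspect?command=" + command


if __name__ == "__main__":
    req = build_backup_request("books", "asdf3", "/tmp/foo")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    print(build_introspect_url("backup-collection"))
```

Building the request without sending it keeps the sketch runnable without a Solr node. Passing the resulting object to urllib.request.urlopen would issue the call against whatever base URL is supplied.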
>
>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>
>> Actually, I don’t think we have a v2 Backup/Restore API. We have a path alias to the old API, which takes GET ...&action=backup..., but we don’t have a true v2 API spec for it, do we? Where is that documented?
>>
>> Jan Høydahl
>>
>>>> On 22 Dec 2020 at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>
>>> Hey guys,
>>>
>>> Following up to make sure I understand the specifics you're
>>> suggesting. You're proposing that:
>>>
>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
>>> be added in v2-form only.
>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
>>> made in V2-form only.
>>> 3. All existing v1 backup-related APIs be deprecated and left
>>> unchanged. Incremental backups will not be possible using the v1 API.
>>>
>>> I'm not against going this route if there's consensus around it. But
>>> I'm not 100% clear on how it means we don't need to worry about
>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
>>> API - I understand how leaving the v1 APIs untouched (other than
>>> deprecation) frees us of some backcompat concerns there, but we would
>>> still need to make tweaks to the v2 backup/restore APIs and would have
>>> to tread just as carefully there in terms of backcompat, afaict.
>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
>>> leave v2 changes to be made freely? I looked around to see if the v2
>>> APIs had any sort of "experimental" designation, but couldn't find
>>> that clearly stated anywhere. Am I missing something?
>>>
>>> Best,
>>>
>>> Jason
>>>
>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
>>>>
>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
>>>>
>>>>
>>>> V2 only please
>>>>
>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>
>>>>> Hey Jan, thanks for the review.
>>>>>
>>>>> I hadn't thought about the V2 API in connection to this work. You're
>>>>> right though I think - the SIP proposes net-new APIs, so it should add
>>>>> V2 equivalents at the very least. I'll draft tentative details for
>>>>> these APIs on the SIP and we can refine things from there.
>>>>>
>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
>>>>> changes to these v2 APIs. It is an elegant approach to the
>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
>>>>> like. But it would let users create snapshot-based backups (and keep
>>>>> us maintaining that code) longer than there's any strict need to. And
>>>>> users are left on the less-efficient format by default. (By contrast,
>>>>> the current SIP has snapshot-backup creation being replaced by
>>>>> incremental-backup creation as soon as the latter is available.) Did
>>>>> you have a particular lifespan in mind for snapshot-based creation if
>>>>> we go with this approach?
>>>>>
>>>>> Jason
>>>>>
>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>>>
>>>>>> Much needed! Thanks for initiating this Jason!
>>>>>>
>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, implement the new improved version as a v2 API only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head start on converting the COLLECTION API to v2 style.
>>>>>>
>>>>>> Jan
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Hey, Happy New Year everybody.

Some SIP updates based on the discussion above:

I added v2 examples for each API to the SIP. Feedback welcome,
especially on the v2 APIs that are net-new to this proposal (namely:
"list backups" and "delete backup").

I've also amended the backcompat/migration section to mention Jan's
suggestion that the "incremental" features be exposed in the v2 API
only. Though it's unclear to me whether that's still something people
want since it turns out that we'll still have backcompat concerns with
the existing v2 backup/restore APIs. So I've held off from
removing/replacing the original plan.

Link for convenience:
https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore

Best,

Jason


On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> OK, that’s the one I was looking for. It’s not documented in the backup chapter of the ref guide :(
>
> Jan Høydahl
>
> > 23. des. 2020 kl. 17:10 skrev Jason Gerlowski <gerlowskija@gmail.com>:
> >
> > ?
> >>
> >> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
> >
> > Tbh I'm not yet familiar enough with the v2 APIs to understand the
> > distinction you're making. (Do you have a pointer to something that'd
> > fill me in?)
> >
> > To zoom in on "backup" as an example, the v2 API I'm referring to
> > looks like: /v2/collections" -d '{ "backup-collection":
> > {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
> > And it's included in the v2 "introspect" documentation returned by
> > this API: /v2/collections/_introspect?command=backup-collection". To
> > me that looked like a v2 API, but maybe path-aliases are also covered
> > in the introspect docs.
> >
> > Jason
> >
> >> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >>
> >> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
> >>
> >> Jan Høydahl
> >>
> >>>> 22. des. 2020 kl. 18:04 skrev Jason Gerlowski <gerlowskija@gmail.com>:
> >>>
> >>> ?Hey guys,
> >>>
> >>> Following up to make sure I understand the specifics you're
> >>> suggesting. You're proposing that:
> >>>
> >>> 1. The brand new backup-related APIs (list-backups and delete-backup)
> >>> be added in v2-form only.
> >>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
> >>> made in V2-form only.
> >>> 3. All existing v1 backup-related APIs be deprecated and left
> >>> unchanged. Incremental backups will not be possible using the v1 API.
> >>>
> >>> I'm not against going this route if there's consensus around it. But
> >>> I'm not 100% clear on how it means we don't need to worry about
> >>> backcompat. Backup and Restore currently exist as both a v1 and a v2
> >>> API - I understand how leaving the v1 APIs untouched (other than
> >>> deprecation) frees us of some backcompat concerns there, but we would
> >>> still need to make tweaks to the v2 backup/restore APIs and would have
> >>> to tread just as carefully there in terms of backcompat, afaict.
> >>> Unless Solr's backcompatibility guarantees only cover the v1 API and
> >>> leave v2 changes to be made freely? I looked around to see if the v2
> >>> APIs had any sort of "experimental" designation, but couldn't find
> >>> that clearly stated anywhere. Am I missing something?
> >>>
> >>> Best,
> >>>
> >>> Jason
> >>>
> >>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
> >>>>
> >>>>> , and implement the new imporved version as a V2-api only, and then deprecate the v1 API?
> >>>>
> >>>>
> >>>> V2 only please
> >>>>
> >>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >>>>>
> >>>>> Hey Jan, thanks for the review.
> >>>>>
> >>>>> I hadn't thought about the V2 API in connection to this work. You're
> >>>>> right though I think - the SIP proposes net-new APIs, so it should add
> >>>>> V2 equivalents at the very least. I'll draft tentative details for
> >>>>> these APIs on the SIP and we can refine things from there.
> >>>>>
> >>>>> I'm more up in the air on your specific suggestion to restrict the SIP
> >>>>> changes to these v2 APIs. It is an elegant approach to the
> >>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
> >>>>> like. But it would let users create snapshot-based backups (and keep
> >>>>> us maintaining that code) longer than there's any strict need to. And
> >>>>> users are left on the less-efficient format by default. (By contrast,
> >>>>> the current SIP has snapshot-backup creation being replaced by
> >>>>> incremental-backup creation as soon as the latter is available.). Did
> >>>>> you have a particular lifespan in mind for snapshot-based creation if
> >>>>> we go with this approach?
> >>>>>
> >>>>> Jason
> >>>>>
> >>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >>>>>>
> >>>>>> Much needed! Thanks for initiating this Jason!
> >>>>>>
> >>>>>> As we want to move away from v1 APIs where a HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/resotre APIs as-is, and implement the new imporved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
> >>>>>>
> >>>>>> Jan
> >>>>>>
> >>>>>>> 15. des. 2020 kl. 15:48 skrev Jason Gerlowski <gerlowskija@gmail.com>:
> >>>>>>>
> >>>>>>> Hey all,
> >>>>>>>
> >>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
> >>>>>>> backup and restore functionality. While the "headline" improvement in
> >>>>>>> this SIP is a change to do backups incrementally, it bundles in a
> >>>>>>> number of other improvements as well, including the addition of
> >>>>>>> corruption checks, APIs to list and delete backups, and stronger
> >>>>>>> integration points with popular object storage APIs.
> >>>>>>>
> >>>>>>> The SIP can be found here:
> >>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> >>>>>>>
> >>>>>>> Please read the SIP description and come back here for discussion. As
> >>>>>>> the discussion progresses we will update the SIP page with any
> >>>>>>> outcomes and eventually move things to a VOTE.
> >>>>>>>
> >>>>>>> Looking forward to hearing your feedback.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> Jason
> >>>>>>>
> >>>>>>> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> -----------------------------------------------------
> >>>> Noble Paul
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Can you explicitly call out in the SIP how it relates to the work done in
SOLR-13608?

On Tue, Jan 5, 2021 at 8:55 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:

> Hey, Happy New Year everybody.
>
> Some SIP updates based on the discussion above:
>
> I added v2 examples for each API to the SIP. Feedback welcome,
> especially on the v2 APIs that are net-new to this proposal (namely:
> "list backups" and "delete backup").
>
> I've also amended the backcompat/migration section to mention Jan's
> suggestion that the "incremental" features be exposed in the v2 API
> only. Though it's unclear to me whether that's still something people
> want since it turns out that we'll still have backcompat concerns with
> the existing v2 backup/restore APIs. So I've held off from
> removing/replacing the original plan.
>
> Link for convenience:
>
> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>
> Best,
>
> Jason
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
This is a very thorough SIP, thank you for spending the time on it, Jason!

I have a few minor questions about points that are unclear to me.

1) If we assume that we cannot overwrite files, how does the manifest file
stay current for incremental backup operations to the same directory?
2) How is the manifest file functionally different from the segments_n and
segments.gen files?
3) Does the maxNumBackups parameter consider incremental backups or only
full backups? What happens if we have a full backup and then N incremental
ones? Do we delete the full backup and convert the oldest incremental one
into a full? I imagine this might be a metadata operation, but then the
concerns from question 1 apply.
4) Do we plan to retrofit HDFS Backup and Local File Backup to use the new
interfaces? I believe we should, but may be willing to accept this as out
of scope.
5) Regarding cloud provider test resources, we can also approach the ASF
Infra team to ask for cloud credits. Can you give rough estimates on what
kind of resourcing would be needed?

I did not examine the new APIs in detail, but they looked fine at a high
level. I will probably look again after the questions regarding v1/v2
are settled.

On Tue, Jan 5, 2021 at 10:11 AM Mike Drob <mdrob@mdrob.com> wrote:

> Can you explicitly call out in the SIP how it relates to the work done in
> SOLR-13608?
>> a
>> > >>>>>>> number of other improvements as well, including the addition of
>> > >>>>>>> corruption checks, APIs to list and delete backups, and stronger
>> > >>>>>>> integration points with popular object storage APIs.
>> > >>>>>>>
>> > >>>>>>> The SIP can be found here:
>> > >>>>>>>
>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>> > >>>>>>>
>> > >>>>>>> Please read the SIP description and come back here for
>> discussion. As
>> > >>>>>>> the discussion progresses we will update the SIP page with any
>> > >>>>>>> outcomes and eventually move things to a VOTE.
>> > >>>>>>>
>> > >>>>>>> Looking forward to hearing your feedback.
>> > >>>>>>>
>> > >>>>>>> Best,
>> > >>>>>>>
>> > >>>>>>> Jason
>> > >>>>>>>
>> > >>>>>>>
>> ---------------------------------------------------------------------
>> > >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> ---------------------------------------------------------------------
>> > >>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>>>>>
>> > >>>>>
>> > >>>>>
>> ---------------------------------------------------------------------
>> > >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>>>>
>> > >>>>
>> > >>>>
>> > >>>> --
>> > >>>> -----------------------------------------------------
>> > >>>> Noble Paul
>> > >>>>
>> > >>>>
>> ---------------------------------------------------------------------
>> > >>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>>>
>> > >>>
>> > >>>
>> ---------------------------------------------------------------------
>> > >>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >>> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>>
>> > >>
>> > >> ---------------------------------------------------------------------
>> > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > >> For additional commands, e-mail: dev-help@lucene.apache.org
>> > >>
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > > For additional commands, e-mail: dev-help@lucene.apache.org
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: dev-help@lucene.apache.org
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Thanks for the feedback, Mike. I've gotta give the credit to Shalin
though; he wrote most of it before the holiday. He and Dat wrote much
of the code involved as well. I haven't done more than steward things
along so far. As you suggested, I've updated the SIP to mention the
related SOLR-13608 (see the bottom of the "Motivation" section).

As for your questions, I've tried to answer them below.

1. Good catch - it doesn't. The SIP should read that each backup
creates its own manifest files as needed for directories it creates
under the base "location". This way, additional backups can be added
to the same location without needing to modify existing metadata
files. I've updated the SIP to reflect this.

2. The proposed metadata file is a lot like segments_n (in spirit) in
that it has pointers to each of the index files that comprise an
index/replica. But it differs in that it stores additional
information about each file (checksum, size) separate from the file
itself. It also allows a layer of naming indirection between what
files are named in the storage repository and what name they should be
given upon restoration. This helps to avoid confusion that would
otherwise arise between identically named files when e.g. a shard
leader changes between two incremental backups. (I'll try to expand
on this in the SIP, as it's a bit hard to give the full context here.)

3. My intention was that the 'maxNumBackups' parameter would only
refer to the incremental backups in a given location. This was mostly
informed by the fact that traditional backups today are required to be
1-per-location. (I.e. a backup in 8.6.3 will error out if the
specified directory has files in it.) We could fix that aspect of
traditional backups and find semantics for 'maxNumBackups' that might
include traditional ones, but IMO it'd add complexity and work for a
format that the SIP is trying to replace more broadly anyways.

4. I definitely intended to update LocalFileSystemRepository. I have
code to update HdfsBackupRepository as well, but wasn't quite sure
where that stood since it's currently deprecated. I haven't seen
plans to make it a plugin, but might've just missed those discussions
in other mail. Anyway, I plan to update it but that assumes it's
sticking around in one form or another.

5. Good idea - I didn't realize that was an option. But it would be
really nice if possible. I don't have an estimate on resources. I
expect the need would be relatively small - you could restrict the
tests to running on the nightly runs on ASF's Jenkins unless devs
provide their own (e.g.) s3 creds. But that's just a guess obviously,
and not even in concrete terms.
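
To make the 'maxNumBackups' semantics from point 3 a bit more concrete, here is a minimal Python sketch of a retention check that counts only incremental backups and never touches traditional ones. Everything in it is an assumption for illustration: the `Backup` class, the `backup_id` field and the `backups_to_delete` helper are hypothetical names and are not taken from the SIP or from Solr's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    backup_id: int      # hypothetical id, assumed to grow with each new backup
    incremental: bool   # False for a traditional (snapshot-based) backup


def backups_to_delete(backups, max_num_backups):
    """Return the oldest incremental backups that exceed the limit.

    Traditional backups are neither counted nor selected for deletion.
    """
    if max_num_backups < 1:
        raise ValueError("max_num_backups must be at least 1")
    # Only incremental backups take part in retention, oldest first.
    incremental = sorted(
        (b for b in backups if b.incremental),
        key=lambda b: b.backup_id,
    )
    excess = len(incremental) - max_num_backups
    return incremental[:excess] if excess > 0 else []
```

With three incremental backups in a location and a limit of two, only the oldest incremental one would be selected; a traditional backup listed alongside them would be ignored.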

Thanks again for taking the time to wade through the SIP - really
appreciate the feedback. Hope the answers help!

Best,

Jason

On Tue, Jan 5, 2021 at 11:52 AM Mike Drob <mdrob@mdrob.com> wrote:
>
> This is a very thorough SIP, thank you for spending the time on it, Jason!
>
> I have a few minor questions about points that are unclear to me.
>
> 1) If we assume that we cannot overwrite files, how does the manifest file stay current for incremental backup operations to the same directory?
> 2) How is the manifest file functionally different from the segments_n and segments.gen files?
> 3) Does the maxNumBackups parameter consider incremental backups or only full backups? What happens if we have a full backup and then N incremental ones? Do we delete the full backup and convert the oldest incremental one into a full? I imagine this might be a metadata operation, but then the concerns from question 1 apply.
> 4) Do we plan to retrofit HDFS Backup and Local File Backup to use the new interfaces? I believe we should, but may be willing to accept this as out of scope.
> 5) Regarding cloud provider test resources, we can also approach the ASF Infra team to ask for cloud credits. Can you give rough estimates on what kind of resourcing would be needed?
>
> I did not examine the new APIs in detail, but they looked fine at a high level overview. Will probably look again after questions regarding v1/v2 are figured out.

Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Thanks Jason! This is great, and a very much needed feature.

> This helps to avoid confusion that would
> otherwise arise between identically named files when e.g. a shard
> leader changes between two incremental backups. (I'll try to expand
> on this in the SIP, as it's a bit hard to give the full context here.)

Thanks, I was wondering the same thing. Maybe it would be good to include an
example of what the file structure of a backup looks like, and of what the
manifest file looks like. As you said, a file with the same name
may refer to different segments created by different cores or the same one
(even if the leader changed, it may be a file from a previous replication).
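
As a starting point for that kind of example, here is a rough Python sketch of the naming indirection as I understand it from Jason's description. It is only a guess at the idea, not the SIP's actual format: the `ManifestEntry` fields, the use of SHA-256, the random stored names and the `build_manifest` helper are all assumptions made for illustration. Each backup produces its own manifest, so earlier manifests are never modified.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    restore_name: str   # name the file should get on restore, e.g. "_0.cfs"
    stored_name: str    # unique name used inside the backup repository
    checksum: str       # SHA-256 in this sketch; the real format may differ
    size: int


def build_manifest(index_files, previous_manifest):
    """Build the manifest for one backup.

    index_files: dict mapping index file name -> file content (bytes).
    previous_manifest: list of ManifestEntry from the previous backup.
    Returns (manifest, uploads), where uploads maps stored_name -> bytes
    for the files that actually need to be copied.
    """
    known = {(e.restore_name, e.checksum, e.size): e for e in previous_manifest}
    manifest, uploads = [], {}
    for name, data in sorted(index_files.items()):
        digest = hashlib.sha256(data).hexdigest()
        entry = known.get((name, digest, len(data)))
        if entry is None:
            # Same name but different content (or a brand new file):
            # store it under a fresh unique name so nothing is overwritten.
            entry = ManifestEntry(name, uuid.uuid4().hex, digest, len(data))
            uploads[entry.stored_name] = data
        manifest.append(entry)
    return manifest, uploads
```

In this sketch, a file called "_0.cfs" written by a new leader with different content gets a new stored name, while the same "_0.cfs" left over from a previous replication matches on checksum and size and is simply referenced again without being re-uploaded.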

> >>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Sure thing. I put together a writeup on the file layout and formats
here: https://cwiki.apache.org/confluence/display/SOLR/Incremental+Backup+File+Format
The details get a little verbose, so I made it a subpage that the
SIP-proper calls out to.

Let me know what you think when you get a chance to read - hopefully
that's sufficient to fill the gap.

Jason
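
For readers following along without the wiki page open, here is a minimal sketch of the manifest properties discussed in this thread: a per-file checksum and size stored separately from the file, plus a layer of indirection between the name a file has in the repository and the name it gets on restore. This is not the format from the wiki page. The field names, the use of SHA-256, and the plain dict standing in for a backup repository are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """One index file as recorded in a hypothetical per-shard manifest."""
    stored_name: str   # unique name used inside the backup repository
    restore_name: str  # index file name to give the file on restore
    size: int          # size in bytes, kept separately from the file
    checksum: str      # hex digest, kept separately from the file


def make_entry(stored_name, restore_name, data):
    # SHA-256 is an assumption; the thread only says "checksum".
    return ManifestEntry(stored_name, restore_name, len(data),
                         hashlib.sha256(data).hexdigest())


def restore(manifest, repository):
    """Return {restore_name: bytes}, raising ValueError on a corrupt file."""
    restored = {}
    for entry in manifest:
        data = repository[entry.stored_name]
        digest = hashlib.sha256(data).hexdigest()
        if len(data) != entry.size or digest != entry.checksum:
            raise ValueError(f"corrupt backup file: {entry.stored_name}")
        restored[entry.restore_name] = data
    return restored


# Two backups taken before and after a leader change: both contain a file
# named "_0.cfs", but the contents differ, so each is stored under its own
# unique name and mapped back to "_0.cfs" only at restore time.
old = b"segment written by the first leader"
new = b"segment written by the second leader"
repository = {"file-0001": old, "file-0002": new}
backup_1 = [make_entry("file-0001", "_0.cfs", old)]
backup_2 = [make_entry("file-0002", "_0.cfs", new)]
```

Because the two identically named files never share a stored name, adding the second backup to the same location does not overwrite anything from the first.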

On Thu, Jan 7, 2021 at 8:34 PM Tomás Fernández Löbbe
<tomasflobbe@gmail.com> wrote:
>
> Thanks Jason! This is great, and a very much needed feature.
>
> > This helps to avoid confusion that would
> > otherwise arise between identically named files when e.g. a shard
> > leader changes between two incremental backups. (I'll try to expand
> > on this in the SIP, as it's a bit hard to give the full context here.)
>
> Thanks, I was wondering the same thing. Maybe it would be good to include an example of what the file structure of a backup looks like, and what the manifest file looks like. As you said, a file with the same name may refer to different segments created by different cores or the same one (even if the leader changed, it may be a file from a previous replication).
>
> On Thu, Jan 7, 2021 at 1:20 PM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>
>> Thanks for the feedback Mike. I've gotta give any credit to Shalin
>> though, he wrote most of it before the holiday. He and Dat wrote much
>> of the code involved as well. I haven't done more than steward things
>> along so far. As you suggested, I've updated the SIP to mention the
>> related SOLR-13608 (see the bottom of the "Motivation" section).
>>
>> As for your questions, I've tried to answer them below.
>>
>> 1. Good catch - it doesn't. The SIP should read that each backup
>> creates its own manifest files as needed for directories it creates
>> under the base "location". This way, additional backups can be added
>> to the same location without needing to modify existing metadata
>> files. I've updated the SIP to reflect this.
>>
>> 2. The proposed metadata file is a lot like segments_n (in spirit) in
>> that it has pointers to each of the index files that comprise an
>> index/replica. But it differs in that it stores additional
>> information about each file (checksum, size) separate from the file
>> itself. It also allows a layer of naming indirection between what
>> files are named in the storage repository and what name they should be
>> given upon restoration. This helps to avoid confusion that would
>> otherwise arise between identically named files when e.g. a shard
>> leader changes between two incremental backups. (I'll try to expand
>> on this in the SIP, as it's a bit hard to give the full context here.)
>>
>> 3. My intention was that the 'maxNumBackups' parameter would only
>> refer to the incremental backups in a given location. This was mostly
>> informed by the fact that traditional backups today are required to be
>> 1-per-location. (i.e. a backup in 8.6.3 will error out if the
>> specified directory has files in it.) We could fix that aspect of
>> traditional backups and find semantics for 'maxNumBackups' that might
>> include traditional ones, but IMO it'd add complexity and work for a
>> format that the SIP is trying to replace more broadly anyways.
>>
>> 4. I definitely intended to update LocalFileSystemRepository. I have
>> code to update HdfsBackupRepository as well, but wasn't quite sure
>> where that stood since it's currently deprecated. I haven't seen
>> plans to make it a plugin, but might've just missed those discussions
>> in other mail. Anyway, I plan to update it but that assumes it's
>> sticking around in one form or another.
>>
>> 5. Good idea - I didn't realize that was an option. But it would be
>> really nice if possible. I don't have an estimate on resources. I
>> expect the need would be relatively small - you could restrict the
>> tests to running on the nightly runs on ASF's Jenkins unless devs
>> provide their own (e.g.) s3 creds. But that's just a guess obviously,
>> and not even in concrete terms.
>>
>> Thanks again for taking the time to wade through the SIP - really
>> appreciate the feedback. Hope the answers help!
>>
>> Best,
>>
>> Jason
>>
>> On Tue, Jan 5, 2021 at 11:52 AM Mike Drob <mdrob@mdrob.com> wrote:
>> >
>> > This is a very thorough SIP, thank you for spending the time on it, Jason!
>> >
>> > I have a few minor questions about points that are unclear to me.
>> >
>> > 1) If we assume that we cannot overwrite files, how does the manifest file stay current for incremental backup operations to the same directory?
>> > 2) How is the manifest file functionally different from the segments_n and segments.gen files?
>> > 3) Does the maxNumBackups parameter consider incremental backups or only full backups? What happens if we have a full backup and then N incremental ones? Do we delete the full backup and convert the oldest incremental one into a full? I imagine this might be a metadata operation, but then the concerns from question 1 apply.
>> > 4) Do we plan to retrofit HDFS Backup and Local File Backup to use the new interfaces? I believe we should, but may be willing to accept this as out of scope.
>> > 5) Regarding cloud provider test resources, we can also approach the ASF Infra team to ask for cloud credits. Can you give rough estimates on what kind of resourcing would be needed?
>> >
>> > I did not examine the new APIs in detail, but they looked fine at a high level overview. Will probably look again after questions regarding v1/v2 are figured out.
>> >
>> > On Tue, Jan 5, 2021 at 10:11 AM Mike Drob <mdrob@mdrob.com> wrote:
>> >>
>> >> Can you explicitly call out in the SIP how it relates to the work done in SOLR-13608?
>> >>
>> >> On Tue, Jan 5, 2021 at 8:55 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >>>
>> >>> Hey, Happy New Year everybody.
>> >>>
>> >>> Some SIP updates based on the discussion above:
>> >>>
>> >>> I added v2 examples for each API to the SIP. Feedback welcome,
>> >>> especially on the v2 APIs that are net-new to this proposal (namely:
>> >>> "list backups" and "delete backup").
>> >>>
>> >>> I've also amended the backcompat/migration section to mention Jan's
>> >>> suggestion that the "incremental" features be exposed in the v2 API
>> >>> only. Though it's unclear to me whether that's still something people
>> >>> want since it turns out that we'll still have backcompat concerns with
>> >>> the existing v2 backup/restore APIs. So I've held off from
>> >>> removing/replacing the original plan.
>> >>>
>> >>> Link for convenience:
>> >>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>> >>>
>> >>> Best,
>> >>>
>> >>> Jason
>> >>>
>> >>>
>> >>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> >>> >
>> >>> > Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
>> >>> >
>> >>> > Jan Høydahl
>> >>> >
>> >>> > > On 23 Dec 2020, at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >>> > >
>> >>> > >>
>> >>> > >> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
>> >>> > >
>> >>> > > Tbh I'm not yet familiar enough with the v2 APIs to understand the
>> >>> > > distinction you're making. (Do you have a pointer to something that'd
>> >>> > > fill me in?)
>> >>> > >
>> >>> > > To zoom in on "backup" as an example, the v2 API I'm referring to
>> >>> > > looks like: /v2/collections" -d '{ "backup-collection":
>> >>> > > {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
>> >>> > > And it's included in the v2 "introspect" documentation returned by
>> >>> > > this API: /v2/collections/_introspect?command=backup-collection". To
>> >>> > > me that looked like a v2 API, but maybe path-aliases are also covered
>> >>> > > in the introspect docs.
>> >>> > >
>> >>> > > Jason
>> >>> > >
>> >>> > >> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> >>> > >>
>> >>> > >> Actually, I don’t think we have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
>> >>> > >>
>> >>> > >> Jan Høydahl
>> >>> > >>
>> >>> > >>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >>> > >>>
>> >>> > >>> Hey guys,
>> >>> > >>>
>> >>> > >>> Following up to make sure I understand the specifics you're
>> >>> > >>> suggesting. You're proposing that:
>> >>> > >>>
>> >>> > >>> 1. The brand new backup-related APIs (list-backups and delete-backup)
>> >>> > >>> be added in v2-form only.
>> >>> > >>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
>> >>> > >>> made in V2-form only.
>> >>> > >>> 3. All existing v1 backup-related APIs be deprecated and left
>> >>> > >>> unchanged. Incremental backups will not be possible using the v1 API.
>> >>> > >>>
>> >>> > >>> I'm not against going this route if there's consensus around it. But
>> >>> > >>> I'm not 100% clear on how it means we don't need to worry about
>> >>> > >>> backcompat. Backup and Restore currently exist as both a v1 and a v2
>> >>> > >>> API - I understand how leaving the v1 APIs untouched (other than
>> >>> > >>> deprecation) frees us of some backcompat concerns there, but we would
>> >>> > >>> still need to make tweaks to the v2 backup/restore APIs and would have
>> >>> > >>> to tread just as carefully there in terms of backcompat, afaict.
>> >>> > >>> Unless Solr's backcompatibility guarantees only cover the v1 API and
>> >>> > >>> leave v2 changes to be made freely? I looked around to see if the v2
>> >>> > >>> APIs had any sort of "experimental" designation, but couldn't find
>> >>> > >>> that clearly stated anywhere. Am I missing something?
>> >>> > >>>
>> >>> > >>> Best,
>> >>> > >>>
>> >>> > >>> Jason
>> >>> > >>>
>> >>> > >>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
>> >>> > >>>>
>> >>> > >>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
>> >>> > >>>>
>> >>> > >>>>
>> >>> > >>>> V2 only please
>> >>> > >>>>
>> >>> > >>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >>> > >>>>>
>> >>> > >>>>> Hey Jan, thanks for the review.
>> >>> > >>>>>
>> >>> > >>>>> I hadn't thought about the V2 API in connection to this work. You're
>> >>> > >>>>> right though I think - the SIP proposes net-new APIs, so it should add
>> >>> > >>>>> V2 equivalents at the very least. I'll draft tentative details for
>> >>> > >>>>> these APIs on the SIP and we can refine things from there.
>> >>> > >>>>>
>> >>> > >>>>> I'm more up in the air on your specific suggestion to restrict the SIP
>> >>> > >>>>> changes to these v2 APIs. It is an elegant approach to the
>> >>> > >>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
>> >>> > >>>>> like. But it would let users create snapshot-based backups (and keep
>> >>> > >>>>> us maintaining that code) longer than there's any strict need to. And
>> >>> > >>>>> users are left on the less-efficient format by default. (By contrast,
>> >>> > >>>>> the current SIP has snapshot-backup creation being replaced by
>> >>> > >>>>> incremental-backup creation as soon as the latter is available.) Did
>> >>> > >>>>> you have a particular lifespan in mind for snapshot-based creation if
>> >>> > >>>>> we go with this approach?
>> >>> > >>>>>
>> >>> > >>>>> Jason
>> >>> > >>>>>
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Do the shard metadata files list all of the segments that make up the
backup, or only the segments that were uploaded in this incremental update?
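
To make the two options in the question concrete, here is a minimal sketch that assumes the first one: each shard metadata file lists every file that makes up the backup, while only files missing from the previous backup are actually uploaded. The thread has not settled which option the SIP uses at this point, so treat this as an illustration only. The function name, the tuple layout, and the `backup_id/` naming scheme are hypothetical.

```python
def plan_incremental_backup(backup_id, previous_manifest, current_files):
    """Plan one incremental backup of a shard.

    previous_manifest: {file_name: (stored_name, checksum)} from the last backup
    current_files:     {file_name: checksum} for the index as it is now
    Returns (new_manifest, uploads). new_manifest lists every file in the
    backup; uploads lists only the files that must be copied this time.
    """
    new_manifest = {}
    uploads = []
    for name, checksum in current_files.items():
        previous = previous_manifest.get(name)
        if previous is not None and previous[1] == checksum:
            # Unchanged since the last backup: point at the stored copy.
            new_manifest[name] = previous
        else:
            # New or changed: store under a name unique to this backup.
            new_manifest[name] = (f"{backup_id}/{name}", checksum)
            uploads.append(name)
    return new_manifest, uploads


first, first_uploads = plan_incremental_backup(
    "backup_0", {}, {"_0.cfs": "aaa", "segments_1": "bbb"})
second, second_uploads = plan_incremental_backup(
    "backup_1", first, {"_0.cfs": "aaa", "_1.cfs": "ccc", "segments_2": "ddd"})
```

With a complete listing, a restore only needs to read the one metadata file for the backup it is restoring. Under the other option, it would have to read the earlier metadata files as well to find the segments that were not re-uploaded.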

On Fri, Jan 8, 2021 at 3:19 PM Jason Gerlowski <gerlowskija@gmail.com>
wrote:

> Sure thing. I put together a writeup on the file layout and formats
> here:
> https://cwiki.apache.org/confluence/display/SOLR/Incremental+Backup+File+Format
> The details get a little verbose, so I made it a subpage that the
> SIP-proper calls out to.
>
> Let me know what you think when you get a chance to read - hopefully
> that's sufficient to fill the gap.
>
> Jason
>
> On Thu, Jan 7, 2021 at 8:34 PM Tomás Fernández Löbbe
> <tomasflobbe@gmail.com> wrote:
> >
> > Thanks Jason! This is great, and a very much needed feature.
> >
> > > This helps to avoid confusion that would
> > > otherwise arise between identically named files when e.g. a shard
> > > leader changes between two incremental backups. (I'll try to expand
> > > on this in the SIP, as it's a bit hard to give the full context here.)
> >
> > Thanks, I was wondering the same thing. Maybe it would be good to put an
> example of how the file structure of a backup looks like in the backup? and
> how the manifest file looks like. As you said, a file with the same name
> may refer to different segments created by different cores or the same one
> (even if the leader changed, it may be a file from a previous replication).
> >
> > On Thu, Jan 7, 2021 at 1:20 PM Jason Gerlowski <gerlowskija@gmail.com>
> wrote:
> >>
> >> Thanks for the feedback Mike. I've gotta give any credit to Shalin
> >> though, he wrote most of it before the holiday. He and Dat wrote much
> >> of the code involved as well. I haven't done more than steward things
> >> along so far. As you suggested, I've updated the SIP to mention the
> >> related SOLR-13608 (see the bottom of the "Motivation" section).
> >>
> >> As for your questions, I've tried to answer them below.
> >>
> >> 1. Good catch - it doesn't. The SIP should read that each backup
> >> creates its own manifest files as needed for directories it creates
> >> under the base "location". This way, additional backups can be added
> >> to the same location without needing to modify existing metadata
> >> files. I've updated the SIP to reflect this.
> >>
> >> 2. The proposed metadata file is a lot like segments_n (in spirit) in
> >> that it has pointers to each index file that comprise an
> >> index/replica. But it differs in that it stores additional
> >> information about each file (checksum, size) separate from the file
> >> itself. It also allows a layer of naming indirection between what
> >> files are named in the storage repository and what name they should be
> >> given upon restoration. This helps to avoid confusion that would
> >> otherwise arise between identically named files when e.g. a shard
> >> leader changes between two incremental backups. (I'll try to expand
> >> on this in the SIP, as it's a bit hard to give the full context here.)
> >>
> >> 3. My intention was that the 'maxNumBackups' parameter would only
> >> refer to the incremental backups in a given location. This was mostly
> >> informed by the fact that traditional backups today are required to be
> >> 1-per-location. (i.e. a backup in 8.6.3 will error out if the
> >> specified directory has files in it.). We could fix that aspect of
> >> traditional backups and find semantics for 'maxNumBackups' that might
> >> include traditional ones, but IMO it'd add complexity and work for a
> >> format that the SIP is trying to replace more broadly anyways.
> >>
> >> 4. I definitely intended to update LocalFileSystemRepository. I have
> >> code to update HdfsBackupRepository as well, but wasn't quite sure
> >> where that stood since it's currently deprecated. I haven't seen
> >> plans to make it a plugin, but might've just missed those discussions
> >> in other mail. Anyway, I plan to update it but that assumes it's
> >> sticking around in one form or another.
> >>
> >> 5. Good idea - I didn't realize that was an option. But it would be
> >> really nice if possible. I don't have an estimate on resources. I
> >> expect the need would be relatively small - you could restrict the
> >> tests to running on the nightly runs on ASF's Jenkins unless devs
> >> provide their own (e.g.) s3 creds. But that's just a guess obviously,
> >> and not even in concrete terms.
> >>
> >> Thanks again for taking the time to wade through the SIP - really
> >> appreciate the feedback. Hope the answers help!
> >>
> >> Best,
> >>
> >> Jason
> >>
> >> On Tue, Jan 5, 2021 at 11:52 AM Mike Drob <mdrob@mdrob.com> wrote:
> >> >
> >> > This is a very thorough SIP, thank you for spending the time on it,
> Jason!
> >> >
> >> > I have a few minor questions about points that are unclear to me.
> >> >
> >> > 1) If we assume that we cannot overwrite files, how does the manifest
> file stay current for incremental backup operations to the same directory?
> >> > 2) How is the manifest file functionally different from the
> segments_n and segments.gen files?
> >> > 3) Does the maxNumBackups parameter consider incremental backups or
> only full backups? What happens if we have a full backup and then N
> incremental ones? Do we delete the full backup and convert the oldest
> incremental one into a full? I imagine this might be a metadata operation,
> but then the concerns from question 1 apply.
> >> > 4) Do we plan to retrofit HDFS Backup and Local File Backup to use
> the new interfaces? I believe we should, but may be willing to accept this
> as out of scope.
> >> > 5) Regarding cloud provider test resources, we can also approach the
> ASF Infra team to ask for cloud credits. Can you give rough estimates on
> what kind of resourcing would be needed?
> >> >
> >> > I did not examine the new APIs in detail, but they looked fine at a
> high level overview. Will probably look again after questions regarding
> v1/v2 are figured out.
> >> >
> >> > On Tue, Jan 5, 2021 at 10:11 AM Mike Drob <mdrob@mdrob.com> wrote:
> >> >>
> >> >> Can you explicitly call out in the SIP how it relates to the work
> done in SOLR-13608?
> >> >>
> >> >> On Tue, Jan 5, 2021 at 8:55 AM Jason Gerlowski <
> gerlowskija@gmail.com> wrote:
> >> >>>
> >> >>> Hey, Happy New Year everybody.
> >> >>>
> >> >>> Some SIP updates based on the discussion above:
> >> >>>
> >> >>> I added v2 examples for each API to the SIP. Feedback welcome,
> >> >>> especially on the v2 APIs that are net-new to this proposal (namely:
> >> >>> "list backups" and "delete backup").
> >> >>>
> >> >>> I've also amended the backcompat/migration section to mention Jan's
> >> >>> suggestion that the "incremental" features be exposed in the v2 API
> >> >>> only. Though it's unclear to me whether that's still something
> people
> >> >>> want since it turns out that we'll still have backcompat concerns
> with
> >> >>> the existing v2 backup/restore APIs. So I've held off from
> >> >>> removing/replacing the original plan.
> >> >>>
> >> >>> Link for convenience:
> >> >>>
> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> >> >>>
> >> >>> Best,
> >> >>>
> >> >>> Jason
> >> >>>
> >> >>>
> >> >>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com>
> wrote:
> >> >>> >
> >> >>> > Ok, that’s the one I was looking for, it’s not documented in the
> backup chapter of ref-guide :(
> >> >>> >
> >> >>> > Jan Høydahl
> >> >>> >
> >> >>> > > 23. des. 2020 kl. 17:10 skrev Jason Gerlowski <
> gerlowskija@gmail.com>:
> >> >>> > >
> >> >>> > > ?
> >> >>> > >>
> >> >>> > >> We have a path alias to the old API ... but we don’t have a
> true v2 API spec for it, do we?
> >> >>> > >
> >> >>> > > Tbh I'm not yet familiar enough with the v2 APIs to understand
> the
> >> >>> > > distinction you're making. (Do you have a pointer to something
> that'd
> >> >>> > > fill me in?)
> >> >>> > >
> >> >>> > > To zoom in on "backup" as an example, the v2 API I'm referring
> to
> >> >>> > > looks like: /v2/collections" -d '{ "backup-collection":
> >> >>> > > {"collection": "books", "name": "asdf3", "location":
> "/tmp/foo"}}'.
> >> >>> > > And it's included in the v2 "introspect" documentation returned
> by
> >> >>> > > this API:
> /v2/collections/_introspect?command=backup-collection". To
> >> >>> > > me that looked like a v2 API, but maybe path-aliases are also
> covered
> >> >>> > > in the introspect docs.
> >> >>> > >
> >> >>> > > Jason
> >> >>> > >
> >> >>> > >> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <
> jan.asf@cominvent.com> wrote:
> >> >>> > >>
> >> >>> > >> Actually, don’t think we do have a v2 Backup/Restore API. We
> have a path alias to the old API which takes GET ...&action=backup... but
> we don’t have a true v2 API spec for it, do we? Where is that documented?
> >> >>> > >>
> >> >>> > >> Jan Høydahl
> >> >>> > >>
> >> >>> > >>>> 22. des. 2020 kl. 18:04 skrev Jason Gerlowski <
> gerlowskija@gmail.com>:
> >> >>> > >>>
> >> >>> > >>> ?Hey guys,
> >> >>> > >>>
> >> >>> > >>> Following up to make sure I understand the specifics you're
> >> >>> > >>> suggesting. You're proposing that:
> >> >>> > >>>
> >> >>> > >>> 1. The brand new backup-related APIs (list-backups and
> delete-backup)
> >> >>> > >>> be added in v2-form only.
> >> >>> > >>> 2. Tweaks to existing backup-related APIs (create-backup,
> restore) be
> >> >>> > >>> made in V2-form only.
> >> >>> > >>> 3. All existing v1 backup-related APIs be deprecated and left
> >> >>> > >>> unchanged. Incremental backups will not be possible using
> the v1 API.
> >> >>> > >>>
> >> >>> > >>> I'm not against going this route if there's consensus around
> it. But
> >> >>> > >>> I'm not 100% clear on how it means we don't need to worry
> about
> >> >>> > >>> backcompat. Backup and Restore currently exist as both a v1
> and a v2
> >> >>> > >>> API - I understand how leaving the v1 APIs untouched (other
> than
> >> >>> > >>> deprecation) frees us of some backcompat concerns there, but
> we would
> >> >>> > >>> still need to make tweaks to the v2 backup/restore APIs and
> would have
> >> >>> > >>> to tread just as carefully there in terms of backcompat,
> afaict.
> >> >>> > >>> Unless Solr's backcompatibility guarantees only cover the v1
> API and
> >> >>> > >>> leave v2 changes to be made freely? I looked around to see
> if the v2
> >> >>> > >>> APIs had any sort of "experimental" designation, but couldn't
> find
> >> >>> > >>> that clearly stated anywhere. Am I missing something?
> >> >>> > >>>
> >> >>> > >>> Best,
> >> >>> > >>>
> >> >>> > >>> Jason
> >> >>> > >>>
> >> >>> > >>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <
> noble.paul@gmail.com> wrote:
> >> >>> > >>>>
> >> >>> > >>>>> , and implement the new imporved version as a V2-api only,
> and then deprecate the v1 API?
> >> >>> > >>>>
> >> >>> > >>>>
> >> >>> > >>>> V2 only please
> >> >>> > >>>>
> >> >>> > >>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <
> gerlowskija@gmail.com> wrote:
> >> >>> > >>>>>
> >> >>> > >>>>> Hey Jan, thanks for the review.
> >> >>> > >>>>>
> >> >>> > >>>>> I hadn't thought about the V2 API in connection to this
> work. You're
> >> >>> > >>>>> right though I think - the SIP proposes net-new APIs, so it
> should add
> >> >>> > >>>>> V2 equivalents at the very least. I'll draft tentative
> details for
> >> >>> > >>>>> these APIs on the SIP and we can refine things from there.
> >> >>> > >>>>>
> >> >>> > >>>>> I'm more up in the air on your specific suggestion to
> restrict the SIP
> >> >>> > >>>>> changes to these v2 APIs. It is an elegant approach to the
> >> >>> > >>>>> backcompat, and it provides a carrot for v2 adoption - both
> of which I
> >> >>> > >>>>> like. But it would let users create snapshot-based backups
> (and keep
> >> >>> > >>>>> us maintaining that code) longer than there's any strict
> need to. And
> >> >>> > >>>>> users are left on the less-efficient format by default.
> (By contrast,
> >> >>> > >>>>> the current SIP has snapshot-backup creation being replaced
> by
> >> >>> > >>>>> incremental-backup creation as soon as the latter is
> available.). Did
> >> >>> > >>>>> you have a particular lifespan in mind for snapshot-based
> creation if
> >> >>> > >>>>> we go with this approach?
> >> >>> > >>>>>
> >> >>> > >>>>> Jason
> >> >>> > >>>>>
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Jason, I know you've seen SOLR-15051 (Shared storage -- BlobDirectory), but
perhaps not the part of the linked document,
https://docs.google.com/document/d/1kjQPK80sLiZJyRjek_Edhokfc5q9S3ISvFRM2_YeL8M/edit#
"ZK Data: Listings", that discusses an aspect of the design that is very
similar to a part of SIP-12. We both want to store checksums and file
lengths. Maybe the approaches should/could be aligned on this? I'll look
at SIP-12 more. A brief browse shows that it seems kind of Git-like in
that it has a store of all files, each with a UUID name. Your proposal did
not discuss how these files are GC'ed. In my BlobDirectory proposal, files
are stored where they are created, under their natural names. Files are
shared via ref-counting in the per-directory metadata files (which I call
"listings") so that files can be deleted immediately if not shared. To
handle race conditions on a metadata file, BlobDirectory uses ZK instead of
requiring anything of the backing blob store.
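
The ref-counting idea above can be illustrated with a minimal sketch. It assumes one in-memory "listing" per directory and a plain dict standing in for the blob store. The names `Listing`, `ListingEntry`, `add_ref` and `release` are hypothetical and are not taken from the BlobDirectory proposal, CRC32 is only a stand-in checksum, and the ZK coordination that guards against races is left out entirely.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ListingEntry:
    length: int      # file length in bytes
    checksum: int    # CRC32 of the contents (stand-in for the real checksum)
    ref_count: int   # how many users currently share this stored file


@dataclass
class Listing:
    """Hypothetical per-directory metadata file (a "listing")."""
    entries: dict = field(default_factory=dict)  # natural file name -> ListingEntry

    def add_ref(self, name, data, blob_store):
        """Register one more user of `name`, uploading it on first use."""
        entry = self.entries.get(name)
        if entry is None:
            blob_store[name] = data  # stored under its natural name
            self.entries[name] = ListingEntry(len(data), zlib.crc32(data), 1)
        else:
            entry.ref_count += 1

    def release(self, name, blob_store):
        """Drop one user of `name`; delete the file once nobody shares it."""
        entry = self.entries[name]
        entry.ref_count -= 1
        if entry.ref_count == 0:
            del self.entries[name]
            del blob_store[name]  # no separate GC pass is needed
```

Because the count lives next to the length and checksum, deleting an unshared file is a single metadata update rather than a scan over every backup.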

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 8, 2021 at 4:51 PM Mike Drob <mdrob@apache.org> wrote:

> Do the shard metadata files list all of the segments that make up the
> backup, or only the segments that were uploaded in this incremental update?
>
> On Fri, Jan 8, 2021 at 3:19 PM Jason Gerlowski <gerlowskija@gmail.com>
> wrote:
>
>> Sure thing. I put together a writeup on the file layout and formats
>> here:
>> https://cwiki.apache.org/confluence/display/SOLR/Incremental+Backup+File+Format
>> The details get a little verbose, so I made it a subpage that the
>> SIP-proper calls out to.
>>
>> Let me know what you think when you get a chance to read - hopefully
>> that's sufficient to fill the gap.
>>
>> Jason
>>
>> On Thu, Jan 7, 2021 at 8:34 PM Tomás Fernández Löbbe
>> <tomasflobbe@gmail.com> wrote:
>> >
>> > Thanks Jason! This is great, and a very much needed feature.
>> >
>> > > This helps to avoid confusion that would
>> > > otherwise arise between identically named files when e.g. a shard
>> > > leader changes between two incremental backups. (I'll try to expand
>> > > on this in the SIP, as it's a bit hard to give the full context here.)
>> >
>> > Thanks, I was wondering the same thing. Maybe it would be good to put an example of how the file structure of a backup looks like in the backup? and how the manifest file looks like. As you said, a file with the same name may refer to different segments created by different cores or the same one (even if the leader changed, it may be a file from a previous replication).
>> >
>> > On Thu, Jan 7, 2021 at 1:20 PM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >>
>> >> Thanks for the feedback Mike. I've gotta give any credit to Shalin
>> >> though, he wrote most of it before the holiday. He and Dat wrote much
>> >> of the code involved as well. I haven't done more than steward things
>> >> along so far. As you suggested, I've updated the SIP to mention the
>> >> related SOLR-13608 (see the bottom of the "Motivation" section).
>> >>
>> >> As for your questions, I've tried to answer them below.
>> >>
>> >> 1. Good catch - it doesn't. The SIP should read that each backup
>> >> creates its own manifest files as needed for directories it creates
>> >> under the base "location". This way, additional backups can be added
>> >> to the same location without needing to modify existing metadata
>> >> files. I've updated the SIP to reflect this.
>> >>
>> >> 2. The proposed metadata file is a lot like segments_n (in spirit) in
>> >> that it has pointers to each index file that comprise an
>> >> index/replica. But it differs in that it stores additional
>> >> information about each file (checksum, size) separate from the file
>> >> itself. It also allows a layer of naming indirection between what
>> >> files are named in the storage repository and what name they should be
>> >> given upon restoration. This helps to avoid confusion that would
>> >> otherwise arise between identically named files when e.g. a shard
>> >> leader changes between two incremental backups. (I'll try to expand
>> >> on this in the SIP, as it's a bit hard to give the full context here.)
>> >>
>> >> 3. My intention was that the 'maxNumBackups' parameter would only
>> >> refer to the incremental backups in a given location. This was mostly
>> >> informed by the fact that traditional backups today are required to be
>> >> 1-per-location. (i.e. a backup in 8.6.3 will error out if the
>> >> specified directory has files in it.). We could fix that aspect of
>> >> traditional backups and find semantics for 'maxNumBackups' that might
>> >> include traditional ones, but IMO it'd add complexity and work for a
>> >> format that the SIP is trying to replace more broadly anyways.
>> >>
>> >> 4. I definitely intended to update LocalFileSystemRepository. I have
>> >> code to update HdfsBackupRepository as well, but wasn't quite sure
>> >> where that stood since it's currently deprecated. I haven't seen
>> >> plans to make it a plugin, but might've just missed those discussions
>> >> in other mail. Anyway, I plan to update it but that assumes it's
>> >> sticking around in one form or another.
>> >>
>> >> 5. Good idea - I didn't realize that was an option. But it would be
>> >> really nice if possible. I don't have an estimate on resources. I
>> >> expect the need would be relatively small - you could restrict the
>> >> tests to running on the nightly runs on ASF's Jenkins unless devs
>> >> provide their own (e.g.) s3 creds. But that's just a guess obviously,
>> >> and not even in concrete terms.
>> >>
>> >> Thanks again for taking the time to wade through the SIP - really
>> >> appreciate the feedback. Hope the answers help!
>> >>
>> >> Best,
>> >>
>> >> Jason
>> >>
>> >> On Tue, Jan 5, 2021 at 11:52 AM Mike Drob <mdrob@mdrob.com> wrote:
>> >> >
>> >> > This is a very thorough SIP, thank you for spending the time on it, Jason!
>> >> >
>> >> > I have a few minor questions about points that are unclear to me.
>> >> >
>> >> > 1) If we assume that we cannot overwrite files, how does the manifest file stay current for incremental backup operations to the same directory?
>> >> > 2) How is the manifest file functionally different from the segments_n and segments.gen files?
>> >> > 3) Does the maxNumBackups parameter consider incremental backups or only full backups? What happens if we have a full backup and then N incremental ones? Do we delete the full backup and convert the oldest incremental one into a full? I imagine this might be a metadata operation, but then the concerns from question 1 apply.
>> >> > 4) Do we plan to retrofit HDFS Backup and Local File Backup to use the new interfaces? I believe we should, but may be willing to accept this as out of scope.
>> >> > 5) Regarding cloud provider test resources, we can also approach the ASF Infra team to ask for cloud credits. Can you give rough estimates on what kind of resourcing would be needed?
>> >> >
>> >> > I did not examine the new APIs in detail, but they looked fine at a high level overview. Will probably look again after questions regarding v1/v2 are figured out.
>> >> >
>> >> > On Tue, Jan 5, 2021 at 10:11 AM Mike Drob <mdrob@mdrob.com> wrote:
>> >> >>
>> >> >> Can you explicitly call out in the SIP how it relates to the work done in SOLR-13608?
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Jason, Shalin and Dat, thanks for the thorough work. This is an example for other SIPs to follow!

> I've also amended the backcompat/migration section to mention Jan's
> suggestion that the "incremental" features be exposed in the v2 API
> only. Though it's unclear to me whether that's still something people
> want since it turns out that we'll still have backcompat concerns with
> the existing v2 backup/restore APIs. So I've held off from
> removing/replacing the original plan.

Since we already have v2 for the existing backup API, I guess my suggestion is not that 'clean' after all.

Another approach would be to leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup, with support for backing up multiple collections in one go, or backing up a TRA (Time Routed Alias) with hundreds of concrete "sub" collections. But as I write these words I realize it is probably way outside the scope of this SIP, which is large enough already. Has anyone even tried to back up a TRA with today's API?

Jan

> 5. jan. 2021 kl. 15:55 skrev Jason Gerlowski <gerlowskija@gmail.com>:
>
> Hey, Happy New Year everybody.
>
> Some SIP updates based on the discussion above:
>
> I added v2 examples for each API to the SIP. Feedback welcome,
> especially on the v2 APIs that are net-new to this proposal (namely:
> "list backups" and "delete backup").
>
> I've also amended the backcompat/migration section to mention Jan's
> suggestion that the "incremental" features be exposed in the v2 API
> only. Though it's unclear to me whether that's still something people
> want since it turns out that we'll still have backcompat concerns with
> the existing v2 backup/restore APIs. So I've held off from
> removing/replacing the original plan.
>
> Link for convenience:
> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>
> Best,
>
> Jason
>
>
> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>
>> Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
>>
>> Jan Høydahl
>>
>>> 23. des. 2020 kl. 17:10 skrev Jason Gerlowski <gerlowskija@gmail.com>:
>>>
>>>
>>>>
>>>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
>>>
>>> Tbh I'm not yet familiar enough with the v2 APIs to understand the
>>> distinction you're making. (Do you have a pointer to something that'd
>>> fill me in?)
>>>
>>> To zoom in on "backup" as an example, the v2 API I'm referring to
>>> looks like: /v2/collections" -d '{ "backup-collection":
>>> {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
>>> And it's included in the v2 "introspect" documentation returned by
>>> this API: /v2/collections/_introspect?command=backup-collection". To
>>> me that looked like a v2 API, but maybe path-aliases are also covered
>>> in the introspect docs.
>>>
>>> Jason
>>>
>>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>
>>>> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
>>>>
>>>> Jan Høydahl
>>>>
>>>>>> 22. des. 2020 kl. 18:04 skrev Jason Gerlowski <gerlowskija@gmail.com>:
>>>>>
>>>>> Hey guys,
>>>>>
>>>>> Following up to make sure I understand the specifics you're
>>>>> suggesting. You're proposing that:
>>>>>
>>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
>>>>> be added in v2-form only.
>>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
>>>>> made in V2-form only.
>>>>> 3. All existing v1 backup-related APIs be deprecated and left
>>>>> unchanged. Incremental backups will not be possible using the v1 API.
>>>>>
>>>>> I'm not against going this route if there's consensus around it. But
>>>>> I'm not 100% clear on how it means we don't need to worry about
>>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
>>>>> API - I understand how leaving the v1 APIs untouched (other than
>>>>> deprecation) frees us of some backcompat concerns there, but we would
>>>>> still need to make tweaks to the v2 backup/restore APIs and would have
>>>>> to tread just as carefully there in terms of backcompat, afaict.
>>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
>>>>> leave v2 changes to be made freely? I looked around to see if the v2
>>>>> APIs had any sort of "experimental" designation, but couldn't find
>>>>> that clearly stated anywhere. Am I missing something?
>>>>>
>>>>> Best,
>>>>>
>>>>> Jason
>>>>>
>>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
>>>>>>
>>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
>>>>>>
>>>>>>
>>>>>> V2 only please
>>>>>>
>>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>>>
>>>>>>> Hey Jan, thanks for the review.
>>>>>>>
>>>>>>> I hadn't thought about the V2 API in connection to this work. You're
>>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
>>>>>>> V2 equivalents at the very least. I'll draft tentative details for
>>>>>>> these APIs on the SIP and we can refine things from there.
>>>>>>>
>>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
>>>>>>> changes to these v2 APIs. It is an elegant approach to the
>>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
>>>>>>> like. But it would let users create snapshot-based backups (and keep
>>>>>>> us maintaining that code) longer than there's any strict need to. And
>>>>>>> users are left on the less-efficient format by default. (By contrast,
>>>>>>> the current SIP has snapshot-backup creation being replaced by
>>>>>>> incremental-backup creation as soon as the latter is available.). Did
>>>>>>> you have a particular lifespan in mind for snapshot-based creation if
>>>>>>> we go with this approach?
>>>>>>>
>>>>>>> Jason
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Hey all,

I've put replies to everyone's questions below. Hope they help!

> Do the shard metadata files list all of the segments that make up the backup, or only the segments that were uploaded in this incremental update?

Mike: The former - they're intended to hold metadata about all of the
segments that are needed to restore to the given
snapshot/commit-point. So it's likely to hold metadata about files
just uploaded, as well as ones that were added to the blob by previous
backups. I'll see if I can make that clearer in the file
descriptions.
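
To make that concrete, here's a rough Python sketch (not Solr code) of
the bookkeeping for one shard. The function and field names are
hypothetical, and matching files by name plus checksum is my
assumption - the SIP page is the authority on the real file format.

```python
import uuid


def plan_incremental_backup(commit_files, previous_metadata):
    """Return (new_metadata, to_upload) for one shard.

    commit_files: {segment file name: checksum} for the commit point
        being backed up.
    previous_metadata: {segment file name: {"checksum": ..., "blob_name": ...}}
        from the prior backup, or {} for the first backup.
    """
    new_metadata = {}
    to_upload = []
    for name, checksum in sorted(commit_files.items()):
        prior = previous_metadata.get(name)
        if prior is not None and prior["checksum"] == checksum:
            # Already in the blob store: reference it, don't upload again.
            new_metadata[name] = dict(prior)
        else:
            # New or changed file: give it a UUID name and upload it.
            new_metadata[name] = {
                "checksum": checksum,
                "blob_name": str(uuid.uuid4()),
            }
            to_upload.append(name)
    return new_metadata, to_upload
```

The new metadata lists every file needed for a restore, while only the
files in to_upload are actually copied by this backup.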

> leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup

Jan: Ultimately I agree with your concerns about scope, so I'd vote
against trying to cover TRAs, multiple collection backups, etc. in
this effort here.

That aside though, I agree that the existing v2 backup API is a bit of
a headscratcher. Why is it /v2/collections instead of
/v2/collections/<collectionName> or a subpath of /v2/cluster? Does it
have something to do with aliases? Or did it end up there mostly by
default? I'd be open to creating a new v2 backup endpoint (without
adding TRA, etc. compatibility) if there was consensus on that
approach to handling backcompat and on the specific appearance of the
API. It would help with backcompat after all. Though if finding
consensus bogs down it may not be worth the addition.
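
For reference, the existing v2 call quoted elsewhere in this thread is
a POST to /v2/collections whose JSON body names the command, with the
collection passed inside the body rather than in the path. A small
Python sketch that only builds that request and sends nothing; the
base URL is a placeholder assumption, and the path is copied as
written in the thread:

```python
import json
import urllib.request


def build_backup_request(base_url, collection, name, location):
    """Build (but do not send) the existing v2 'backup-collection' request."""
    body = {
        "backup-collection": {
            "collection": collection,
            "name": name,
            "location": location,
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/collections",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Values from the example in the thread; the host is hypothetical.
req = build_backup_request(
    "http://solr.example.invalid", "books", "asdf3", "/tmp/foo"
)
```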

> I know you've seen SOLR-15051 (Shared storage -- BlobDirectory) ... We both want to store checksums and file lengths. ... Your proposal did not discuss how these files are GC'ed

David: SIP-12 does address this, though maybe the writeup needs
clarifying. The Delete Backup API includes a "purge" parameter which
triggers GC activity. This probably works about the way you'd expect
- Solr gets the list of UUID-named index files from the blob store,
and then it compares that list to the set of UUIDs referenced by any
shard-metadata file (which requires reading all the shard-metadata
files). This avoids adding to Solr's ZK state, but does so at the
cost of requiring users to trigger sporadic cleanup manually instead
of detecting orphans automatically like BlobDirectory does (assuming I
understand that correctly).
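
The comparison described above boils down to a set difference. A
minimal Python sketch, assuming the file listing and the parsed
shard-metadata contents are already in hand (the function name and
input shapes are hypothetical, not the SIP's):

```python
def find_orphaned_index_files(blob_store_files, shard_metadata_files):
    """Return UUID-named index files that no shard-metadata file references.

    blob_store_files: iterable of file names listed from the blob store.
    shard_metadata_files: iterable of collections of referenced file
        names, one per shard-metadata file (every one must be read).
    """
    referenced = set()
    for metadata in shard_metadata_files:
        referenced.update(metadata)
    return sorted(set(blob_store_files) - referenced)
```

Anything this returns is a candidate for deletion during a purge.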

I'm def not saying this is the best approach necessarily. I like it,
though it has downsides for sure. Just that there is a proposed
approach that's easy to miss buried in the SIP.

More broadly though - I share your sense that we should consider
alignment. It may end up that Backup/Restore is different enough from
the BlobDirectory usecase that it doesn't make sense, but it's at
least worth figuring out. That's about as far as my understanding
goes right now though. I'll read up on BlobDirectory while you absorb
SIP-12 and maybe we can circle back to this shortly.

Best,

Jason

On Sun, Jan 10, 2021 at 7:20 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> Jason, Shalin and Dat, thanks for the thorough work. This is an example for other SIPs to follow!
>
> > I've also amended the backcompat/migration section to mention Jan's
> > suggestion that the "incremental" features be exposed in the v2 API
> > only. Though it's unclear to me whether that's still something people
> > want since it turns out that we'll still have backcompat concerns with
> > the existing v2 backup/restore APIs. So I've held off from
> > removing/replacing the original plan.
>
> Since we already have v2 for the existing backup API, I guess my suggestion is not that 'clean' after all.
>
> Another approach would be to leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup, with support for backing up multiple collections in one go, or backing up a TRA alias with hundreds of concrete "sub" collections. But as I write these words I imagine it is probably way outside the scope of this SIP, which is large enough already. Has anyone even tried to back up a TRA with today's API?
>
> Jan
>
> > On 5 Jan 2021, at 15:55, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >
> > Hey, Happy New Year everybody.
> >
> > Some SIP updates based on the discussion above:
> >
> > I added v2 examples for each API to the SIP. Feedback welcome,
> > especially on the v2 APIs that are net-new to this proposal (namely:
> > "list backups" and "delete backup").
> >
> > I've also amended the backcompat/migration section to mention Jan's
> > suggestion that the "incremental" features be exposed in the v2 API
> > only. Though it's unclear to me whether that's still something people
> > want since it turns out that we'll still have backcompat concerns with
> > the existing v2 backup/restore APIs. So I've held off from
> > removing/replacing the original plan.
> >
> > Link for convenience:
> > https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> >
> > Best,
> >
> > Jason
> >
> >
> > On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >>
> >> Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
> >>
> >> Jan Høydahl
> >>
> >>> On 23 Dec 2020, at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >>>
> >>>
> >>>>
> >>>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
> >>>
> >>> Tbh I'm not yet familiar enough with the v2 APIs to understand the
> >>> distinction you're making. (Do you have a pointer to something that'd
> >>> fill me in?)
> >>>
> >>> To zoom in on "backup" as an example, the v2 API I'm referring to
> >>> looks like a POST to /v2/collections with the body
> >>> '{ "backup-collection": {"collection": "books", "name": "asdf3",
> >>> "location": "/tmp/foo"}}'. And it's included in the v2 "introspect"
> >>> documentation returned by this API:
> >>> /v2/collections/_introspect?command=backup-collection. To me that
> >>> looked like a v2 API, but maybe path-aliases are also covered in the
> >>> introspect docs.
> >>>
> >>> Jason
> >>>
> >>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >>>>
> >>>> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
> >>>>
> >>>> Jan Høydahl
> >>>>
> >>>>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >>>>>
> >>>>> Hey guys,
> >>>>>
> >>>>> Following up to make sure I understand the specifics you're
> >>>>> suggesting. You're proposing that:
> >>>>>
> >>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
> >>>>> be added in v2-form only.
> >>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
> >>>>> made in V2-form only.
> >>>>> 3. All existing v1 backup-related APIs be deprecated and left
> >>>>> unchanged. Incremental backups will not be possible using the v1 API.
> >>>>>
> >>>>> I'm not against going this route if there's consensus around it. But
> >>>>> I'm not 100% clear on how it means we don't need to worry about
> >>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
> >>>>> API - I understand how leaving the v1 APIs untouched (other than
> >>>>> deprecation) frees us of some backcompat concerns there, but we would
> >>>>> still need to make tweaks to the v2 backup/restore APIs and would have
> >>>>> to tread just as carefully there in terms of backcompat, afaict.
> >>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
> >>>>> leave v2 changes to be made freely? I looked around to see if the v2
> >>>>> APIs had any sort of "experimental" designation, but couldn't find
> >>>>> that clearly stated anywhere. Am I missing something?
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Jason
> >>>>>
> >>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
> >>>>>>
> >>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
> >>>>>>
> >>>>>>
> >>>>>> V2 only please
> >>>>>>
> >>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hey Jan, thanks for the review.
> >>>>>>>
> >>>>>>> I hadn't thought about the V2 API in connection to this work. You're
> >>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
> >>>>>>> V2 equivalents at the very least. I'll draft tentative details for
> >>>>>>> these APIs on the SIP and we can refine things from there.
> >>>>>>>
> >>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
> >>>>>>> changes to these v2 APIs. It is an elegant approach to the
> >>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
> >>>>>>> like. But it would let users create snapshot-based backups (and keep
> >>>>>>> us maintaining that code) longer than there's any strict need to. And
> >>>>>>> users are left on the less-efficient format by default. (By contrast,
> >>>>>>> the current SIP has snapshot-backup creation being replaced by
> >>>>>>> incremental-backup creation as soon as the latter is available.) Did
> >>>>>>> you have a particular lifespan in mind for snapshot-based creation if
> >>>>>>> we go with this approach?
> >>>>>>>
> >>>>>>> Jason
> >>>>>>>
> >>>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >>>>>>>>
> >>>>>>>> Much needed! Thanks for initiating this Jason!
> >>>>>>>>
> >>>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, and implement the new improved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
> >>>>>>>>
> >>>>>>>> Jan
> >>>>>>>>
> >>>>>>>>> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hey all,
> >>>>>>>>>
> >>>>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
> >>>>>>>>> backup and restore functionality. While the "headline" improvement in
> >>>>>>>>> this SIP is a change to do backups incrementally, it bundles in a
> >>>>>>>>> number of other improvements as well, including the addition of
> >>>>>>>>> corruption checks, APIs to list and delete backups, and stronger
> >>>>>>>>> integration points with popular object storage APIs.
> >>>>>>>>>
> >>>>>>>>> The SIP can be found here:
> >>>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> >>>>>>>>>
> >>>>>>>>> Please read the SIP description and come back here for discussion. As
> >>>>>>>>> the discussion progresses we will update the SIP page with any
> >>>>>>>>> outcomes and eventually move things to a VOTE.
> >>>>>>>>>
> >>>>>>>>> Looking forward to hearing your feedback.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Jason
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Hey all,

Two follow ups on recent discussion.

I reviewed the gc/ref-counting part of the BlobDirectory proposal on
SOLR-15051 that David mentioned. We talked about it a bit offline and
agreed that while an automatic gc mechanism is really needed for what
he's trying to do, the requirements of the backup usecase are
different enough that SIP-12 can get by with manually-triggered
'purging'. Mostly because infrequent static backups produce much less
garbage than continually tracking all files for a (possibly
ever-changing) index.

> I'd be open to creating a new v2 backup endpoint (without adding TRA, etc. compatibility) if there was consensus on that approach to handling backcompat and on the specific appearance of the API

On second thought, I'm going to flip-flop on this. Coming up with a
better v2 API for backup/restore will be easier *after* some of the
questions Jan raised (multi-collection? alias support? etc.) have been
dealt with. i.e. It's tough to decide between /v2/cluster/backups and
/v2/collections/<collectionName>/backups as alternatives until you
figure out whether we currently support multi-collection backup, or
want to in the near future. If people feel strongly or would veto the
vote otherwise, then I'll try my best. But otherwise I think we're
best served waiting until other stuff settles out to revisit larger v2
backup API changes.

Best,

Jason

On Mon, Jan 11, 2021 at 10:41 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>
> Hey all,
>
> I've put replies to everyone's questions below. Hope they help!
>
> > Do the shard metadata files list all of the segments that make up the backup, or only the segments that were uploaded in this incremental update?
>
> Mike: The former - they're intended to hold metadata about all of the
> segments that are needed to restore to the given
> snapshot/commit-point. So it's likely to hold metadata about files
> just uploaded, as well as ones that were added to the blob by previous
> backups. I'll see if I can make that clearer in the file
> descriptions.
>
> > leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup
>
> Jan: Ultimately I agree with your concerns about scope, so I'd vote
> against trying to cover TRAs, multiple collection backups, etc. in
> this effort here.
>
> That aside though, I agree that the existing v2 backup API is a bit of
> a headscratcher. Why is it /v2/collections instead of
> /v2/collections/<collectionName> or a subpath of /v2/cluster? Does it
> have something to do with aliases? Or did it end up there mostly by
> default? I'd be open to creating a new v2 backup endpoint (without
> adding TRA, etc. compatibility) if there was consensus on that
> approach to handling backcompat and on the specific appearance of the
> API. It would help with backcompat after all. Though if finding
> consensus bogs down it may not be worth the addition.
>
> > I know you've seen SOLR-15051 (Shared storage -- BlobDirectory) ... We both want to store checksums and file lengths. ... Your proposal did not discuss how these files are GC'ed
>
> David: SIP-12 does address this, though maybe the writeup needs
> clarifying. The Delete Backup API includes a "purge" parameter which
> triggers GC activity. This probably works about the way you'd expect
> - Solr gets the list of UUID-named index files from the blob store,
> and then it compares that list to the set of UUIDs referenced by any
> shard-metadata file (which requires reading all the shard-metadata
> files). This avoids adding to Solr's ZK state, but does so at the
> cost of requiring users to trigger sporadic cleanup manually instead
> of detecting orphans automatically like BlobDirectory does (assuming I
> understand that correctly).
>
> I'm def not saying this is the best approach necessarily. I like it,
> though it has downsides for sure. Just that there is a proposed
> approach that's easy to miss buried in the SIP.
>
> More broadly though - I share your sense that we should consider
> alignment. It may end up that Backup/Restore is different enough from
> the BlobDirectory usecase that it doesn't make sense, but it's at
> least worth figuring out. That's about as far as my understanding
> goes right now though. I'll read up on BlobDirectory while you absorb
> SIP-12 and maybe we can circle back to this shortly.
>
> Best,
>
> Jason
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
> It's tough to decide between /v2/cluster/backups and
> /v2/collections/<collectionName>/backups as alternatives until you
> figure out whether we currently support multi-collection backup, or
> want to in the near future.

I suppose multi-collection / TRA support could be added later
by supporting e.g. collection=<regex> or alias=my-tra.

However, the file layout chosen here dictates whether one named backup
will be capable of storing more than one collection in the future, so that's
perhaps something to consider. But if it gets too complicated we should
just delay it and redesign the storage structure once again when we cross
that bridge. I'll not veto the current suggestion.

Jan

> On 12 Jan 2021, at 17:53, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>
> Hey all,
>
> Two follow ups on recent discussion.
>
> I reviewed the gc/ref-counting part of the BlobDirectory proposal on
> SOLR-15051 that David mentioned. We talked about it a bit offline and
> agreed that while an automatic gc mechanism is really needed for what
> he's trying to do, the requirements of the backup usecase are
> different enough that SIP-12 can get by with manually-triggered
> 'purging'. Mostly because infrequent static backups produce much less
> garbage than continually tracking all files for a (possibly
> ever-changing) index.
>
>> I'd be open to creating a new v2 backup endpoint (without adding TRA, etc. compatibility) if there was consensus on that approach to handling backcompat and on the specific appearance of the API
>
> On second thought, I'm going to flip-flop on this. Coming up with a
> better v2 API for backup/restore will be easier *after* some of the
> questions Jan raised (multi-collection? alias support? etc.) have been
> dealt with. i.e. It's tough to decide between /v2/cluster/backups and
> /v2/collections/<collectionName>/backups as alternatives until you
> figure out whether we currently support multi-collection backup, or
> want to in the near future. If people feel strongly or would veto the
> vote otherwise, then I'll try my best. But otherwise I think we're
> best served waiting until other stuff settles out to revisit larger v2
> backup API changes.
>
> Best,
>
> Jason
>
> On Mon, Jan 11, 2021 at 10:41 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>
>> Hey all,
>>
>> I've put replies to everyone's questions below. Hope they help!
>>
>>> Do the shard metadata files list all of the segments that make up the backup, or only the segments that were uploaded in this incremental update?
>>
>> Mike: The former - they're intended to hold metadata about all of the
>> segments that are needed to restore to the given
>> snapshot/commit-point. So it's likely to hold metadata about files
>> just uploaded, as well as ones that were added to the blob by previous
>> backups. I'll see if I can make that clearer in the file
>> descriptions.
>>
>>> leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup
>>
>> Jan: Ultimately I agree with your concerns about scope, so I'd vote
>> against trying to cover TRAs, multiple collection backups, etc. in
>> this effort here.
>>
>> That aside though, I agree that the existing v2 backup API is a bit of
>> a headscratcher. Why is it /v2/collections instead of
>> /v2/collections/<collectionName> or a subpath of /v2/cluster? Does it
>> have something to do with aliases? Or did it end up there mostly by
>> default? I'd be open to creating a new v2 backup endpoint (without
>> adding TRA, etc. compatibility) if there was consensus on that
>> approach to handling backcompat and on the specific appearance of the
>> API. It would help with backcompat after all. Though if finding
>> consensus bogs down it may not be worth the addition.
>>
>>> I know you've seen SOLR-15051 (Shared storage -- BlobDirectory) ... We both want to store checksums and file lengths. ... Your proposal did not discuss how these files are GC'ed
>>
>> David: SIP-12 does address this, though maybe the writeup needs
>> clarifying. The Delete Backup API includes a "purge" parameter which
>> triggers GC activity. This probably works about the way you'd expect
>> - Solr gets the list of UUID-named index files from the blob store,
>> and then it compares that list to the set of UUIDs referenced by any
>> shard-metadata file (which requires reading all the shard-metadata
>> files). This avoids adding to Solr's ZK state, but does so at the
>> cost of requiring users to trigger sporadic cleanup manually instead
>> of detecting orphans automatically like BlobDirectory does (assuming I
>> understand that correctly).
>>
>> I'm def not saying this is the best approach necessarily. I like it,
>> though it has downsides for sure. Just that there is a proposed
>> approach that's easy to miss buried in the SIP.
>>
>> More broadly though - I share your sense that we should consider
>> alignment. It may end up that Backup/Restore is different enough from
>> the BlobDirectory usecase that it doesn't make sense, but it's at
>> least worth figuring out. That's about as far as my understanding
>> goes right now though. I'll read up on BlobDirectory while you absorb
>> SIP-12 and maybe we can circle back to this shortly.
>>
>> Best,
>>
>> Jason
>>
>> On Sun, Jan 10, 2021 at 7:20 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>
>>> Jason, Shalin and Dat, thanks for the thorough work. This is an example for other SIPs to follow!
>>>
>>>> I've also amended the backcompat/migration section to mention Jan's
>>>> suggestion that the "incremental" features be exposed in the v2 API
>>>> only. Though it's unclear to me whether that's still something people
>>>> want since it turns out that we'll still have backcompat concerns with
>>>> the existing v2 backup/restore APIs. So I've held off from
>>>> removing/replacing the original plan.
>>>
>>> Since we already have v2 for the existing backup API, I guess my suggestion is not that 'clean' after all.
>>>
>>> Another approach would be to leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup, with support for backing up multiple collections in one go, or backup a TRA alias with hundreds of concrete "sub" collections. But as I write these words I imagine it probably is way outside the scope for this SIP which is large enough. Anyone even tried to backup a TRA with today's API?
>>>
>>> Jan
>>>
>>>> On 5 Jan 2021, at 15:55, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>
>>>> Hey, Happy New Year everybody.
>>>>
>>>> Some SIP updates based on the discussion above:
>>>>
>>>> I added v2 examples for each API to the SIP. Feedback welcome,
>>>> especially on the v2 APIs that are net-new to this proposal (namely:
>>>> "list backups" and "delete backup").
>>>>
>>>> I've also amended the backcompat/migration section to mention Jan's
>>>> suggestion that the "incremental" features be exposed in the v2 API
>>>> only. Though it's unclear to me whether that's still something people
>>>> want since it turns out that we'll still have backcompat concerns with
>>>> the existing v2 backup/restore APIs. So I've held off from
>>>> removing/replacing the original plan.
>>>>
>>>> Link for convenience:
>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>>>>
>>>> Best,
>>>>
>>>> Jason
>>>>
>>>>
>>>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>>
>>>>> Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
>>>>>
>>>>> Jan Høydahl
>>>>>
>>>>>> On 23 Dec 2020, at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
>>>>>>
>>>>>> Tbh I'm not yet familiar enough with the v2 APIs to understand the
>>>>>> distinction you're making. (Do you have a pointer to something that'd
>>>>>> fill me in?)
>>>>>>
>>>>>> To zoom in on "backup" as an example, the v2 API I'm referring to
>>>>>> looks like: /v2/collections" -d '{ "backup-collection":
>>>>>> {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
>>>>>> And it's included in the v2 "introspect" documentation returned by
>>>>>> this API: /v2/collections/_introspect?command=backup-collection". To
>>>>>> me that looked like a v2 API, but maybe path-aliases are also covered
>>>>>> in the introspect docs.
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>>>>
>>>>>>> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
>>>>>>>
>>>>>>> Jan Høydahl
>>>>>>>
>>>>>>>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hey guys,
>>>>>>>>
>>>>>>>> Following up to make sure I understand the specifics you're
>>>>>>>> suggesting. You're proposing that:
>>>>>>>>
>>>>>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
>>>>>>>> be added in v2-form only.
>>>>>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
>>>>>>>> made in V2-form only.
>>>>>>>> 3. All existing v1 backup-related APIs be deprecated and left
>>>>>>>> unchanged. Incremental backups will not be possible using the v1 API.
>>>>>>>>
>>>>>>>> I'm not against going this route if there's consensus around it. But
>>>>>>>> I'm not 100% clear on how it means we don't need to worry about
>>>>>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
>>>>>>>> API - I understand how leaving the v1 APIs untouched (other than
>>>>>>>> deprecation) frees us of some backcompat concerns there, but we would
>>>>>>>> still need to make tweaks to the v2 backup/restore APIs and would have
>>>>>>>> to tread just as carefully there in terms of backcompat, afaict.
>>>>>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
>>>>>>>> leave v2 changes to be made freely? I looked around to see if the v2
>>>>>>>> APIs had any sort of "experimental" designation, but couldn't find
>>>>>>>> that clearly stated anywhere. Am I missing something?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Jason
>>>>>>>>
>>>>>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> V2 only please
>>>>>>>>>
>>>>>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hey Jan, thanks for the review.
>>>>>>>>>>
>>>>>>>>>> I hadn't thought about the V2 API in connection to this work. You're
>>>>>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
>>>>>>>>>> V2 equivalents at the very least. I'll draft tentative details for
>>>>>>>>>> these APIs on the SIP and we can refine things from there.
>>>>>>>>>>
>>>>>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
>>>>>>>>>> changes to these v2 APIs. It is an elegant approach to the
>>>>>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
>>>>>>>>>> like. But it would let users create snapshot-based backups (and keep
>>>>>>>>>> us maintaining that code) longer than there's any strict need to. And
>>>>>>>>>> users are left on the less-efficient format by default. (By contrast,
>>>>>>>>>> the current SIP has snapshot-backup creation being replaced by
>>>>>>>>>> incremental-backup creation as soon as the latter is available.) Did
>>>>>>>>>> you have a particular lifespan in mind for snapshot-based creation if
>>>>>>>>>> we go with this approach?
>>>>>>>>>>
>>>>>>>>>> Jason
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Much needed! Thanks for initiating this Jason!
>>>>>>>>>>>
>>>>>>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, and implement the new improved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
>>>>>>>>>>>
>>>>>>>>>>> Jan
>>>>>>>>>>>
>>>>>>>>>>>> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey all,
>>>>>>>>>>>>
>>>>>>>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
>>>>>>>>>>>> backup and restore functionality. While the "headline" improvement in
>>>>>>>>>>>> this SIP is a change to do backups incrementally, it bundles in a
>>>>>>>>>>>> number of other improvements as well, including the addition of
>>>>>>>>>>>> corruption checks, APIs to list and delete backups, and stronger
>>>>>>>>>>>> integration points with popular object storage APIs.
>>>>>>>>>>>>
>>>>>>>>>>>> The SIP can be found here:
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>>>>>>>>>>>>
>>>>>>>>>>>> Please read the SIP description and come back here for discussion. As
>>>>>>>>>>>> the discussion progresses we will update the SIP page with any
>>>>>>>>>>>> outcomes and eventually move things to a VOTE.
>>>>>>>>>>>>
>>>>>>>>>>>> Looking forward to hearing your feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>>
>>>>>>>>>>>> Jason
>>>>>>>>>>>>


Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Yeah, that's a good point. I think the only change we'd need to make
to the file-structure to support multi-collection down the road would
be to introduce a top-level directory for each collection. I'll
experiment with that and tweak the described file structure to handle
that (assuming the testing pans out).
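
A minimal sketch of what a per-collection top-level directory could look like, purely for illustration. The helper name `collection_backup_paths` and the sub-directory names `index` and `shard_metadata` are hypothetical, as is the assumption that the collection directory sits directly under the named backup; the real layout is whatever the SIP's file-structure page describes.

```python
from pathlib import PurePosixPath


def collection_backup_paths(location: str, backup_name: str, collection: str) -> dict:
    """Build illustrative paths for one collection inside a named backup.

    Assumption: each collection gets its own top-level directory under
    <location>/<backup_name>, so several collections can share one backup.
    The sub-directory names below are made up for this sketch.
    """
    root = PurePosixPath(location) / backup_name / collection
    return {
        "root": root,
        # UUID-named index files uploaded by any backup of this collection
        "index_files": root / "index",
        # one metadata file per shard per backup point
        "shard_metadata": root / "shard_metadata",
    }


books = collection_backup_paths("/tmp/foo", "asdf3", "books")
films = collection_backup_paths("/tmp/foo", "asdf3", "films")
print(books["index_files"])
print(films["index_files"])
```

Because every path is rooted at the collection name, two collections stored under the same backup never share an index or metadata directory.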

Best,

Jason

On Thu, Jan 14, 2021 at 5:10 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> > It's tough to decide between /v2/cluster/backups and
> > /v2/collections/<collectionName>/backups as alternatives until you
> > figure out whether we currently support multi-collection backup, or
> > want to in the near future.
>
> I suppose multi-collection / TRA support could be expanded on later
> by supporting e.g. collection=<regex> or alias=my-tra.
>
> However, the file layout chosen here dictates whether one named backup
> will be capable of storing more than one collection in the future, so that's
> perhaps something to consider. But if it gets too complicated we should
> just delay it and redesign the storage structure once again when we cross
> that bridge. I'll not veto the current suggestion.
>
> Jan
>

Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
Jan - I've modified the file-structure page to include support for
storing multiple collections within the same "location" per your
suggestion.

I've also clarified the file-structure page to state that _all_ files
required to restore a shard are recorded in the shard-metadata file
(regardless of whether the file was uploaded by the current backup or
a past one).
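
To make the shard-metadata point concrete, here is a rough sketch of the
idea, not the actual SIP-12 implementation. Everything in it is an
assumption made for illustration: the function names, the plain dict
standing in for the blob store, and the metadata shape (index file name
mapped to a UUID blob name). It also compares file contents directly,
where a real implementation would more likely compare checksums and
file lengths. It shows an incremental backup uploading only new files
while still recording every file needed for a restore, and how a
manually-triggered purge can then find unreferenced blobs by comparing
the blob list against all shard-metadata files.

```python
import uuid


def incremental_backup(index_files, blob_store, previous_metadata=None):
    """Upload only files that are not already stored; return shard metadata.

    The returned metadata lists EVERY file needed to restore the shard,
    whether it was uploaded now or by an earlier backup.
    (Hypothetical helper; names and shapes are illustrative only.)
    """
    previous = previous_metadata or {}
    metadata = {}
    for name, content in index_files.items():
        prior_blob = previous.get(name)
        if prior_blob is not None and blob_store.get(prior_blob) == content:
            # Already uploaded by a past backup: reference it, don't re-upload.
            metadata[name] = prior_blob
        else:
            blob_id = str(uuid.uuid4())
            blob_store[blob_id] = content
            metadata[name] = blob_id
    return metadata


def find_orphans(blob_store, all_shard_metadata):
    """Return blob names that no remaining shard-metadata file references."""
    referenced = set()
    for metadata in all_shard_metadata:
        referenced.update(metadata.values())
    return set(blob_store) - referenced


blob_store = {}
first = incremental_backup({"_0.cfs": b"aaa", "segments_1": b"s1"}, blob_store)
second = incremental_backup(
    {"_0.cfs": b"aaa", "_1.cfs": b"bbb", "segments_2": b"s2"},
    blob_store,
    previous_metadata=first,
)

# The second backup's metadata still lists _0.cfs, reusing the earlier blob.
print(sorted(second))
# If the first backup is later deleted, only its segments_1 blob is orphaned.
print(find_orphans(blob_store, [second]) == {first["segments_1"]})
```

Because each metadata file is complete on its own, a restore never has
to walk a chain of earlier backups, and the purge only needs the blob
listing plus the metadata files that still exist.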

This DISCUSS thread has been open for a full month now, and it seems
like the feedback is winding down. Pending any objections, I'd like
to move out of the DISCUSS phase and start implementing. The "SIP
Process" page in Confluence mentions holding a VOTE thread for the
final proposal, but I can't find any examples of that actually being
done. Is that a required part of the process, or is the process page
out of date? IMO a VOTE seems slightly redundant, unless success on a
VOTE means that individual PRs can't be -1'd on design grounds?

Either way I'm happy to do whatever the process requires here. I'll
plan on starting the next step on Monday, whatever that needs to be.

Best,

Jason

On Thu, Jan 14, 2021 at 1:10 PM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>
> Yeah, that's a good point. I think the only change we'd need to make
> to the file-structure to support multi-collection down the road would
> be to introduce a top-level directory for each collection. I'll
> experiment with that and tweak the described file structure to handle
> that (assuming the testing pans out).
>
> Best,
>
> Jason
>
> On Thu, Jan 14, 2021 at 5:10 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> >
> > > It's tough to decide between /v2/cluster/backups and
> > > /v2/collections/<collectionName>/backups as alternatives until you
> > > figure out whether we currently support multi-collection backup, or
> > > want to in the near future.
> >
> > I suppose multi-collection / TRA support could be expanded on later
> > by supporting e.g. collection=<regex> or alias=my-tra.
> >
> > However, the file layout chosen here dictates whether one named backup
> > will be capable of storing more than one collection in the future, so that's
> > perhaps something to consider. But if it gets too complicated we should
> > just delay it and redesign the storage structure once again when we cross
> > that bridge. I'll not veto the current suggestion.
> >
> > Jan
> >
> > > On 12 Jan 2021, at 17:53, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >
> > > Hey all,
> > >
> > > Two follow ups on recent discussion.
> > >
> > > I reviewed the gc/ref-counting part of the BlobDirectory proposal on
> > > SOLR-15051 that David mentioned. We talked about it a bit offline and
> > > agreed that while an automatic gc mechanism is really needed for what
> > > he's trying to do, the requirements of the backup usecase are
> > > different enough that SIP-12 can get by with manually-triggered
> > > 'purging'. Mostly because infrequent static backups produce much less
> > > garbage than continually tracking all files for a (possibly
> > > ever-changing) index.
> > >
> > >> I'd be open to creating a new v2 backup endpoint (without adding TRA, etc. compatibility) if there was consensus on that approach to handling backcompat and on the specific appearance of the API
> > >
> > > On second thought, I'm going to flip-flop on this. Coming up with a
> > > better v2 API for backup/restore will be easier *after* some of the
> > > questions Jan raised (multi-collection? alias support? etc.) have been
> > > dealt with. i.e. It's tough to decide between /v2/cluster/backups and
> > > /v2/collections/<collectionName>/backups as alternatives until you
> > > figure out whether we currently support multi-collection backup, or
> > > want to in the near future. If people feel strongly or would veto the
> > > vote otherwise, then I'll try my best. But otherwise I think we're
> > > best served waiting until other stuff settles out to revisit larger v2
> > > backup API changes.
> > >
> > > Best,
> > >
> > > Jason
> > >
> > > On Mon, Jan 11, 2021 at 10:41 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>
> > >> Hey all,
> > >>
> > >> I've put replies to everyone's questions below. Hope they help!
> > >>
> > >>> Do the shard metadata files list all of the segments that make up the backup, or only the segments that were uploaded in this incremental update?
> > >>
> > >> Mike: The former - they're intended to hold metadata about all of the
> > >> segments that are needed to restore to the given
> > >> snapshot/commit-point. So it's likely to hold metadata about files
> > >> just uploaded, as well as ones that were added to the blob by previous
> > >> backups. I'll see if I can make that clearer in the file
> > >> descriptions.
> > >>
> > >>> leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup
> > >>
> > >> Jan: Ultimately I agree with your concerns about scope, so I'd vote
> > >> against trying to cover TRAs, multiple collection backups, etc. in
> > >> this effort here.
> > >>
> > >> That aside though, I agree that the existing v2 backup API is a bit of
> > >> a headscratcher. Why is it /v2/collections instead of
> > >> /v2/collections/<collectionName> or a subpath of /v2/cluster? Does it
> > >> have something to do with aliases? Or did it end up there mostly by
> > >> default? I'd be open to creating a new v2 backup endpoint (without
> > >> adding TRA, etc. compatibility) if there was consensus on that
> > >> approach to handling backcompat and on the specific appearance of the
> > >> API. It would help with backcompat after all. Though if finding
> > >> consensus bogs down it may not be worth the addition.
> > >>
> > >>> I know you've seen SOLR-15051 (Shared storage -- BlobDirectory) ... We both want to store checksums and file lengths. ... Your proposal did not discuss how these files are GC'ed
> > >>
> > >> David: SIP-12 does address this, though maybe the writeup needs
> > >> clarifying. The Delete Backup API includes a "purge" parameter which
> > >> triggers GC activity. This probably works about the way you'd expect
> > >> - Solr gets the list of UUID-named index files from the blob store,
> > >> and then it compares that list to the set of UUIDs referenced by any
> > >> shard-metadata file (which requires reading all the shard-metadata
> > >> files). This avoids adding to Solr's ZK state, but does so at the
> > >> cost of requiring users to trigger sporadic cleanup manually instead
> > >> of detecting orphans automatically like BlobDirectory does (assuming I
> > >> understand that correctly).
> > >>
> > >> I'm def not saying this is the best approach necessarily. I like it,
> > >> though it has downsides for sure. Just that there is a proposed
> > >> approach that's easy to miss buried in the SIP.
> > >>
> > >> More broadly though - I share your sense that we should consider
> > >> alignment. It may end up that Backup/Restore is different enough from
> > >> the BlobDirectory usecase that it doesn't make sense, but it's at
> > >> least worth figuring out. That's about as far as my understanding
> > >> goes right now though. I'll read up on BlobDirectory while you absorb
> > >> SIP-12 and maybe we can circle back to this shortly.
> > >>
> > >> Best,
> > >>
> > >> Jason
> > >>
> > >> On Sun, Jan 10, 2021 at 7:20 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> > >>>
> > >>> Jason, Shalin and Dat, thanks for the thorough work. This is an example for other SIPs to follow!
> > >>>
> > >>>> I've also amended the backcompat/migration section to mention Jan's
> > >>>> suggestion that the "incremental" features be exposed in the v2 API
> > >>>> only. Though it's unclear to me whether that's still something people
> > >>>> want since it turns out that we'll still have backcompat concerns with
> > >>>> the existing v2 backup/restore APIs. So I've held off from
> > >>>> removing/replacing the original plan.
> > >>>
> > >>> Since we already have v2 for the existing backup API, I guess my suggestion is not that 'clean' after all.
> > >>>
> > >>> Another approach would be to leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup, with support for backing up multiple collections in one go, or backing up a TRA alias with hundreds of concrete "sub" collections. But as I write these words I imagine it is probably way outside the scope of this SIP, which is large enough. Has anyone even tried to back up a TRA with today's API?
> > >>>
> > >>> Jan
> > >>>
> > >>>> On 5 Jan 2021, at 15:55, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>>>
> > >>>> Hey, Happy New Year everybody.
> > >>>>
> > >>>> Some SIP updates based on the discussion above:
> > >>>>
> > >>>> I added v2 examples for each API to the SIP. Feedback welcome,
> > >>>> especially on the v2 APIs that are net-new to this proposal (namely:
> > >>>> "list backups" and "delete backup").
> > >>>>
> > >>>> I've also amended the backcompat/migration section to mention Jan's
> > >>>> suggestion that the "incremental" features be exposed in the v2 API
> > >>>> only. Though it's unclear to me whether that's still something people
> > >>>> want since it turns out that we'll still have backcompat concerns with
> > >>>> the existing v2 backup/restore APIs. So I've held off from
> > >>>> removing/replacing the original plan.
> > >>>>
> > >>>> Link for convenience:
> > >>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> > >>>>
> > >>>> Best,
> > >>>>
> > >>>> Jason
> > >>>>
> > >>>>
> > >>>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> > >>>>>
> > >>>>> Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
> > >>>>>
> > >>>>> Jan Høydahl
> > >>>>>
> > >>>>>> On 23 Dec 2020, at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>>>>>
> > >>>>>>>
> > >>>>>>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
> > >>>>>>
> > >>>>>> Tbh I'm not yet familiar enough with the v2 APIs to understand the
> > >>>>>> distinction you're making. (Do you have a pointer to something that'd
> > >>>>>> fill me in?)
> > >>>>>>
> > >>>>>> To zoom in on "backup" as an example, the v2 API I'm referring to
> > >>>>>> looks like: /v2/collections" -d '{ "backup-collection":
> > >>>>>> {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
> > >>>>>> And it's included in the v2 "introspect" documentation returned by
> > >>>>>> this API: /v2/collections/_introspect?command=backup-collection". To
> > >>>>>> me that looked like a v2 API, but maybe path-aliases are also covered
> > >>>>>> in the introspect docs.
> > >>>>>>
> > >>>>>> Jason
> > >>>>>>
> > >>>>>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> > >>>>>>>
> > >>>>>>> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
> > >>>>>>>
> > >>>>>>> Jan Høydahl
> > >>>>>>>
> > >>>>>>>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Hey guys,
> > >>>>>>>>
> > >>>>>>>> Following up to make sure I understand the specifics you're
> > >>>>>>>> suggesting. You're proposing that:
> > >>>>>>>>
> > >>>>>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
> > >>>>>>>> be added in v2-form only.
> > >>>>>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
> > >>>>>>>> made in V2-form only.
> > >>>>>>>> 3. All existing v1 backup-related APIs be deprecated and left
> > >>>>>>>> unchanged. Incremental backups will not be possible using the v1 API.
> > >>>>>>>>
> > >>>>>>>> I'm not against going this route if there's consensus around it. But
> > >>>>>>>> I'm not 100% clear on how it means we don't need to worry about
> > >>>>>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
> > >>>>>>>> API - I understand how leaving the v1 APIs untouched (other than
> > >>>>>>>> deprecation) frees us of some backcompat concerns there, but we would
> > >>>>>>>> still need to make tweaks to the v2 backup/restore APIs and would have
> > >>>>>>>> to tread just as carefully there in terms of backcompat, afaict.
> > >>>>>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
> > >>>>>>>> leave v2 changes to be made freely? I looked around to see if the v2
> > >>>>>>>> APIs had any sort of "experimental" designation, but couldn't find
> > >>>>>>>> that clearly stated anywhere. Am I missing something?
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>>
> > >>>>>>>> Jason
> > >>>>>>>>
> > >>>>>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> V2 only please
> > >>>>>>>>>
> > >>>>>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hey Jan, thanks for the review.
> > >>>>>>>>>>
> > >>>>>>>>>> I hadn't thought about the V2 API in connection to this work. You're
> > >>>>>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
> > >>>>>>>>>> V2 equivalents at the very least. I'll draft tentative details for
> > >>>>>>>>>> these APIs on the SIP and we can refine things from there.
> > >>>>>>>>>>
> > >>>>>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
> > >>>>>>>>>> changes to these v2 APIs. It is an elegant approach to the
> > >>>>>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
> > >>>>>>>>>> like. But it would let users create snapshot-based backups (and keep
> > >>>>>>>>>> us maintaining that code) longer than there's any strict need to. And
> > >>>>>>>>>> users are left on the less-efficient format by default. (By contrast,
> > >>>>>>>>>> the current SIP has snapshot-backup creation being replaced by
> > >>>>>>>>>> incremental-backup creation as soon as the latter is available.) Did
> > >>>>>>>>>> you have a particular lifespan in mind for snapshot-based creation if
> > >>>>>>>>>> we go with this approach?
> > >>>>>>>>>>
> > >>>>>>>>>> Jason
> > >>>>>>>>>>
> > >>>>>>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> Much needed! Thanks for initiating this Jason!
> > >>>>>>>>>>>
> > >>>>>>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, and implement the new improved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Jan
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Hey all,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
> > >>>>>>>>>>>> backup and restore functionality. While the "headline" improvement in
> > >>>>>>>>>>>> this SIP is a change to do backups incrementally, it bundles in a
> > >>>>>>>>>>>> number of other improvements as well, including the addition of
> > >>>>>>>>>>>> corruption checks, APIs to list and delete backups, and stronger
> > >>>>>>>>>>>> integration points with popular object storage APIs.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The SIP can be found here:
> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Please read the SIP description and come back here for discussion. As
> > >>>>>>>>>>>> the discussion progresses we will update the SIP page with any
> > >>>>>>>>>>>> outcomes and eventually move things to a VOTE.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Looking forward to hearing your feedback.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Jason
> > >>>>>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>> -----------------------------------------------------
> > >>>>>>>>> Noble Paul

Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
I think we should embrace lazy consensus -- no formal vote. Announce your intention
to proceed with lazy consensus in two business days.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 15, 2021 at 9:32 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:

> Jan - I've modified the file-structure page to include support for
> storing multiple collections within the same "location" per your
> suggestion.
>
> I've also clarified the file-structure page to state that _all_ files
> required to restore a shard are recorded in the shard-metadata file
> (regardless of whether the file was uploaded by the current backup or
> a past one).
>
> This DISCUSS thread has been open for a full month now, and it seems
> like the feedback is winding down. Pending any objections, I'd like
> to move out of the DISCUSS phase and start implementing. The "SIP
> Process" page in Confluence mentions holding a VOTE thread for the
> final proposal, but I can't find any examples of that actually being
> done. Is that a required part of the process, or is the process page
> out of date? IMO a VOTE seems slightly redundant, unless success on a
> VOTE means that individual PRs can't be -1'd on design grounds?
>
> Either way I'm happy to do whatever the process requires here. I'll
> plan on starting the next step on Monday, whatever that needs to be.
>
> Best,
>
> Jason
>
> On Thu, Jan 14, 2021 at 1:10 PM Jason Gerlowski <gerlowskija@gmail.com>
> wrote:
> >
> > Yeah, that's a good point. I think the only change we'd need to make
> > to the file-structure to support multi-collection down the road would
> > be to introduce a top-level directories for each collection. I'll
> > experiment with that and tweak the described file structure to handle
> > that (assuming the testing pans out).
> >
> > Best,
> >
> > Jason
> >
> > On Thu, Jan 14, 2021 at 5:10 AM Jan Høydahl <jan.asf@cominvent.com>
> wrote:
> > >
> > > > It's tough to decide between /v2/cluster/backups and
> > > > /v2/collections/<collectionName>/backups as alternatives until you
> > > > figure out whether we currently support multi-collection backup, or
> > > > want to in the near future.
> > >
> > > I suppose multi-collection / TRA support cold be expanded on later
> > > by supporting e.g. collection=<regex> or alias=my-tra.
> > >
> > > However, the file layout chosen here dictates whether one named backup
> > > will be capable of storing more than one collection in the future, so
> that's
> > > perhaps something to consider. But if it gets too complicated we should
> > > just delay it and redesign the storage structure once again when we
> cross
> > > that bridge. I'll not veto the current suggestion.
> > >
> > > Jan
> > >
> > > > 12. jan. 2021 kl. 17:53 skrev Jason Gerlowski <gerlowskija@gmail.com
> >:
> > > >
> > > > Hey all,
> > > >
> > > > Two follow ups on recent discussion.
> > > >
> > > > I reviewed the gc/ref-counting part of the BlobDirectory proposal on
> > > > SOLR-15051 that David mentioned. We talked about it a bit offline
> and
> > > > agreed that while an automatic gc mechanism is really needed for what
> > > > he's trying to do, the requirements of the backup usecase are
> > > > different enough that SIP-12 can get by with manually-triggered
> > > > 'purging'. Mostly because infrequent static backups produce much
> less
> > > > garbage than continually tracking all files for a (possibly
> > > > ever-changing) index.
> > > >
> > > >> I'd be open to creating a new v2 backup endpoint (without adding
> TRA, etc. compatibility) if there was consensus on that approach to
> handling backcompat and on the specific appearance of the API
> > > >
> > > > On second thought, I'm going to flip-flop on this. Coming up with a
> > > > better v2 API for backup/restore will be easier *after* some of the
> > > > questions Jan raised (multi-collection? alias support? etc.) have
> been
> > > > dealt with. i.e. It's tough to decide between /v2/cluster/backups
> and
> > > > /v2/collections/<collectionName>/backups as alternatives until you
> > > > figure out whether we currently support multi-collection backup, or
> > > > want to in the near future. If people feel strongly or would veto
> the
> > > > vote otherwise, then I'll try my best. But otherwise I think we're
> > > > best served waiting until other stuff settles out to revisit larger
> v2
> > > > backup API changes.
> > > >
> > > > Best,
> > > >
> > > > Jason
> > > >
> > > > On Mon, Jan 11, 2021 at 10:41 AM Jason Gerlowski <
> gerlowskija@gmail.com> wrote:
> > > >>
> > > >> Hey all,
> > > >>
> > > >> I've put replies to everyone's questions below. Hope they help!
> > > >>
> > > >>> Do the shard metadata files list all of the segments that make up
> the backup, or only the segments that were uploaded in this incremental
> update?
> > > >>
> > > >> Mike: The former - they're intended to hold metadata about all of
> the
> > > >> segments that are needed to restore to the given
> > > >> snapshot/commit-point. So it's likely to hold metadata about files
> > > >> just uploaded, as well as ones that were added to the blob by
> previous
> > > >> backups. I'll see if I can make that clearer in the file
> > > >> descriptions.
> > > >>
> > > >>> leave the old Backup/Restore API as-is, deprecated, and add a new
> one on /v2/cluster/backup
> > > >>
> > > >> Jan: Ultimately I agree with your concerns about scope, so I'd vote
> > > >> against trying to cover TRAs, multiple collection backups, etc. in
> > > >> this effort here.
> > > >>
> > > >> That aside though, I agree that the existing v2 backup API is a bit
> of
> > > >> a headscratcher. Why is it /v2/collections instead of
> > > >> /v2/collections/<collectionName> or a subpath of /v2/cluster? Does
> it
> > > >> have something to do with aliases? Or did it end up there mostly by
> > > >> default? I'd be open to creating a new v2 backup endpoint (without
> > > >> adding TRA, etc. compatibility) if there was consensus on that
> > > >> approach to handling backcompat and on the specific appearance of
> the
> > > >> API. It would help with backcompat after all. Though if finding
> > > >> consensus bogs down it may not be worth the addition.
> > > >>
> > > >>> I know you've seen SOLR-15051 (Shared storage -- BlobDirectory)
> ... We both want to store checksums and file lengths. ... Your proposal did
> not discuss how these files are GC'ed
> > > >>
> > > >> David: SIP-12 does address this, though maybe the writeup needs
> > > >> clarifying. The Delete Backup API includes a "purge" parameter
> which
> > > >> triggers GC activity. This probably works about the way you'd
> expect
> > > >> - Solr gets the list of UUID-named index files from the blob store,
> > > >> and then it compares that list to the set of UUID's referenced by
> any
> > > >> shard-metadata file (which requires reading all the shard-metadata
> > > >> files). This avoids adding to Solr's ZK state, but does so at the
> > > >> cost of requiring users to trigger sporadic cleanup manually instead
> > > >> of detecting orphans automatically like BlobDirectory does
> (assuming I
> > > >> understand that correctly).
> > > >>
> > > >> I'm def not saying this is the best approach necessarily. I like
> it,
> > > >> though it has downsides for sure. Just that there is a proposed
> > > >> approach that's easy to miss buried in the SIP.
> > > >>
> > > >> More broadly though - I share your sense that we should consider
> > > >> alignment. It may end up that Backup/Restore is different enough
> from
> > > >> the BlobDirectory usecase that it doesn't make sense, but it's at
> > > >> least worth figuring out. That's about as far as my understanding
> > > >> goes right now though. I'll read up on BlobDirectory while you
> absorb
> > > >> SIP-12 and maybe we can circle back to this shortly.
> > > >>
> > > >> Best,
> > > >>
> > > >> Jason
> > > >>
> > > >> On Sun, Jan 10, 2021 at 7:20 AM Jan Høydahl <jan.asf@cominvent.com>
> wrote:
> > > >>>
> > > >>> Jason, Shalin and Dat, thanks for the thorough work. This is an
> example for other SIPs to follow!
> > > >>>
> > > >>>> I've also amended the backcompat/migration section to mention
> Jan's
> > > >>>> suggestion that the "incremental" features be exposed in the v2
> API
> > > >>>> only. Though it's unclear to me whether that's still something
> people
> > > >>>> want since it turns out that we'll still have backcompat concerns
> with
> > > >>>> the existing v2 backup/restore APIs. So I've held off from
> > > >>>> removing/replacing the original plan.
> > > >>>
> > > >>> Since we already have v2 for the existing backup API, I guess my
> suggestion is not that 'clean' after all.
> > > >>>
> > > >>> Another approach would be to leave the old Backup/Restore API
> as-is, deprecated, and add a new one on /v2/cluster/backup, with support
> for backing up multiple collections in one go, or backup a TRA alias with
> hundreds of concrete "sub" collections. But as I write these words I
> imagine it probably is way outside the scope for this SIP which is large
> enough. Anyone even tried to backup a TRA with today's API?
> > > >>>
> > > >>> Jan
> > > >>>
> > > >>>> 5. jan. 2021 kl. 15:55 skrev Jason Gerlowski <
> gerlowskija@gmail.com>:
> > > >>>>
> > > >>>> Hey, Happy New Year everybody.
> > > >>>>
> > > >>>> Some SIP updates based on the discussion above:
> > > >>>>
> > > >>>> I added v2 examples for each API to the SIP. Feedback welcome,
> > > >>>> especially on the v2 APIs that are net-new to this proposal
> (namely:
> > > >>>> "list backups" and "delete backup").
> > > >>>>
> > > >>>> I've also amended the backcompat/migration section to mention
> Jan's
> > > >>>> suggestion that the "incremental" features be exposed in the v2
> API
> > > >>>> only. Though it's unclear to me whether that's still something
> people
> > > >>>> want since it turns out that we'll still have backcompat concerns
> with
> > > >>>> the existing v2 backup/restore APIs. So I've held off from
> > > >>>> removing/replacing the original plan.
> > > >>>>
> > > >>>> Link for convenience:
> > > >>>>
> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> > > >>>>
> > > >>>> Best,
> > > >>>>
> > > >>>> Jason
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <
> jan.asf@cominvent.com> wrote:
> > > >>>>>
> > > >>>>> Ok, that’s the one I was looking for, it’s not documented in the
> backup chapter of ref-guide :(
> > > >>>>>
> > > >>>>> Jan Høydahl
> > > >>>>>
> > > >>>>>> 23. des. 2020 kl. 17:10 skrev Jason Gerlowski <
> gerlowskija@gmail.com>:
> > > >>>>>>
> > > >>>>>> ?
> > > >>>>>>>
> > > >>>>>>> We have a path alias to the old API ... but we don’t have a
> true v2 API spec for it, do we?
> > > >>>>>>
> > > >>>>>> Tbh I'm not yet familiar enough with the v2 APIs to understand
> the
> > > >>>>>> distinction you're making. (Do you have a pointer to something
> that'd
> > > >>>>>> fill me in?)
> > > >>>>>>
> > > >>>>>> To zoom in on "backup" as an example, the v2 API I'm referring
> to
> > > >>>>>> looks like: /v2/collections" -d '{ "backup-collection":
> > > >>>>>> {"collection": "books", "name": "asdf3", "location":
> "/tmp/foo"}}'.
> > > >>>>>> And it's included in the v2 "introspect" documentation returned
> by
> > > >>>>>> this API:
> /v2/collections/_introspect?command=backup-collection". To
> > > >>>>>> me that looked like a v2 API, but maybe path-aliases are also
> covered
> > > >>>>>> in the introspect docs.
> > > >>>>>>
> > > >>>>>> Jason
> > > >>>>>>
> > > >>>>>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <
> jan.asf@cominvent.com> wrote:
> > > >>>>>>>
> > > >>>>>>> Actually, don’t think we do have a v2 Backup/Restore API. We
> have a path alias to the old API which takes GET ...&action=backup... but
> we don’t have a true v2 API spec for it, do we? Where is that documented?
> > > >>>>>>>
> > > >>>>>>> Jan Høydahl
> > > >>>>>>>
> > > >>>>>>>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hey guys,
> > > >>>>>>>>
> > > >>>>>>>> Following up to make sure I understand the specifics you're
> > > >>>>>>>> suggesting. You're proposing that:
> > > >>>>>>>>
> > > >>>>>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
> > > >>>>>>>> be added in v2-form only.
> > > >>>>>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
> > > >>>>>>>> made in V2-form only.
> > > >>>>>>>> 3. All existing v1 backup-related APIs be deprecated and left
> > > >>>>>>>> unchanged. Incremental backups will not be possible using the v1 API.
> > > >>>>>>>>
> > > >>>>>>>> I'm not against going this route if there's consensus around it. But
> > > >>>>>>>> I'm not 100% clear on how it means we don't need to worry about
> > > >>>>>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
> > > >>>>>>>> API - I understand how leaving the v1 APIs untouched (other than
> > > >>>>>>>> deprecation) frees us of some backcompat concerns there, but we would
> > > >>>>>>>> still need to make tweaks to the v2 backup/restore APIs and would have
> > > >>>>>>>> to tread just as carefully there in terms of backcompat, afaict.
> > > >>>>>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
> > > >>>>>>>> leave v2 changes to be made freely? I looked around to see if the v2
> > > >>>>>>>> APIs had any sort of "experimental" designation, but couldn't find
> > > >>>>>>>> that clearly stated anywhere. Am I missing something?
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>>
> > > >>>>>>>> Jason
> > > >>>>>>>>
> > > >>>>>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> V2 only please
> > > >>>>>>>>>
> > > >>>>>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hey Jan, thanks for the review.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I hadn't thought about the V2 API in connection to this work. You're
> > > >>>>>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
> > > >>>>>>>>>> V2 equivalents at the very least. I'll draft tentative details for
> > > >>>>>>>>>> these APIs on the SIP and we can refine things from there.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
> > > >>>>>>>>>> changes to these v2 APIs. It is an elegant approach to the
> > > >>>>>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
> > > >>>>>>>>>> like. But it would let users create snapshot-based backups (and keep
> > > >>>>>>>>>> us maintaining that code) longer than there's any strict need to. And
> > > >>>>>>>>>> users are left on the less-efficient format by default. (By contrast,
> > > >>>>>>>>>> the current SIP has snapshot-backup creation being replaced by
> > > >>>>>>>>>> incremental-backup creation as soon as the latter is available.). Did
> > > >>>>>>>>>> you have a particular lifespan in mind for snapshot-based creation if
> > > >>>>>>>>>> we go with this approach?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Jason
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Much needed! Thanks for initiating this Jason!
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, and implement the new improved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Jan
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Hey all,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
> > > >>>>>>>>>>>> backup and restore functionality. While the "headline" improvement in
> > > >>>>>>>>>>>> this SIP is a change to do backups incrementally, it bundles in a
> > > >>>>>>>>>>>> number of other improvements as well, including the addition of
> > > >>>>>>>>>>>> corruption checks, APIs to list and delete backups, and stronger
> > > >>>>>>>>>>>> integration points with popular object storage APIs.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> The SIP can be found here:
> > > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Please read the SIP description and come back here for discussion. As
> > > >>>>>>>>>>>> the discussion progresses we will update the SIP page with any
> > > >>>>>>>>>>>> outcomes and eventually move things to a VOTE.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Looking forward to hearing your feedback.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Jason
> > > >>>>>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> -----------------------------------------------------
> > > >>>>>>>>> Noble Paul
> > > >>>>>>>>>
> > > >>>>>>>>>
> ---------------------------------------------------------------------
> > > >>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> ---------------------------------------------------------------------
> > > >>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> ---------------------------------------------------------------------
> > > >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> ---------------------------------------------------------------------
> > > >>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> ---------------------------------------------------------------------
> > > >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>>
> > > >>>>
> > > >>>>
> ---------------------------------------------------------------------
> > > >>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>>
> > > >>>
> > > >>>
> > > >>>
> ---------------------------------------------------------------------
> > > >>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > >>> For additional commands, e-mail: dev-help@lucene.apache.org
> > > >>>
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: dev-help@lucene.apache.org
> > > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: dev-help@lucene.apache.org
> > >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
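
For readers following the v1/v2 question, the v2 "backup-collection" request quoted in the thread above can be sketched roughly as follows. The path and the JSON body are taken from the quoted message; the host and port, the Content-Type header, and the helper name build_backup_request are assumptions made for illustration. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_backup_request(base_url, collection, name, location):
    """Build (but do not send) the v2 "backup-collection" request quoted in the thread."""
    payload = {
        "backup-collection": {
            "collection": collection,
            "name": name,
            "location": location,
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/collections",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local node: the host and port are assumptions, not from the thread.
request = build_backup_request("http://localhost:8983", "books", "asdf3", "/tmp/foo")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the command in a POST body, rather than as GET ...&action=backup, is the difference between the v1 and v2 styles that the thread is discussing.
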
Re: [DISCUSS] SIP-12: Incremental Backup and Restore [ In reply to ]
> Announce your intention to proceed with lazy consensus in two business days.

I intend to proceed then. : )

I'm going to file JIRAs for the SIP today and start pushing up PRs.
Based on how this discussion went I also plan on updating the "SIP"
process page to say that lazy consensus can be used over a strict VOTE
if no one objects.

Thanks everyone for the attention and feedback.

Jason

On Fri, Jan 15, 2021 at 10:43 AM David Smiley <dsmiley@apache.org> wrote:
>
> I think we should embrace lazy consensus -- no formal vote. Announce your intention to proceed with lazy consensus in two business days.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 15, 2021 at 9:32 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>>
>> Jan - I've modified the file-structure page to include support for
>> storing multiple collections within the same "location" per your
>> suggestion.
>>
>> I've also clarified the file-structure page to state that _all_ files
>> required to restore a shard are recorded in the shard-metadata file
>> (regardless of whether the file was uploaded by the current backup or
>> a past one).
>>
>> This DISCUSS thread has been open for a full month now, and it seems
>> like the feedback is winding down. Pending any objections, I'd like
>> to move out of the DISCUSS phase and start implementing. The "SIP
>> Process" page in Confluence mentions holding a VOTE thread for the
>> final proposal, but I can't find any examples of that actually being
>> done. Is that a required part of the process, or is the process page
>> out of date? IMO a VOTE seems slightly redundant, unless success on a
>> VOTE means that individual PRs can't be -1'd on design grounds?
>>
>> Either way I'm happy to do whatever the process requires here. I'll
>> plan on starting the next step on Monday, whatever that needs to be.
>>
>> Best,
>>
>> Jason
>>
>> On Thu, Jan 14, 2021 at 1:10 PM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> >
>> > Yeah, that's a good point. I think the only change we'd need to make
>> > to the file-structure to support multi-collection down the road would
>> > be to introduce a top-level directory for each collection. I'll
>> > experiment with that and tweak the described file structure to handle
>> > that (assuming the testing pans out).
>> >
>> > Best,
>> >
>> > Jason
>> >
>> > On Thu, Jan 14, 2021 at 5:10 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> > >
>> > > > It's tough to decide between /v2/cluster/backups and
>> > > > /v2/collections/<collectionName>/backups as alternatives until you
>> > > > figure out whether we currently support multi-collection backup, or
>> > > > want to in the near future.
>> > >
>> > > I suppose multi-collection / TRA support could be expanded on later
>> > > by supporting e.g. collection=<regex> or alias=my-tra.
>> > >
>> > > However, the file layout chosen here dictates whether one named backup
>> > > will be capable of storing more than one collection in the future, so that's
>> > > perhaps something to consider. But if it gets too complicated we should
>> > > just delay it and redesign the storage structure once again when we cross
>> > > that bridge. I'll not veto the current suggestion.
>> > >
>> > > Jan
>> > >
>> > > > On 12 Jan 2021, at 17:53, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >
>> > > > Hey all,
>> > > >
>> > > > Two follow ups on recent discussion.
>> > > >
>> > > > I reviewed the gc/ref-counting part of the BlobDirectory proposal on
>> > > > SOLR-15051 that David mentioned. We talked about it a bit offline and
>> > > > agreed that while an automatic gc mechanism is really needed for what
>> > > > he's trying to do, the requirements of the backup usecase are
>> > > > different enough that SIP-12 can get by with manually-triggered
>> > > > 'purging'. Mostly because infrequent static backups produce much less
>> > > > garbage than continually tracking all files for a (possibly
>> > > > ever-changing) index.
>> > > >
>> > > >> I'd be open to creating a new v2 backup endpoint (without adding TRA, etc. compatibility) if there was consensus on that approach to handling backcompat and on the specific appearance of the API
>> > > >
>> > > > On second thought, I'm going to flip-flop on this. Coming up with a
>> > > > better v2 API for backup/restore will be easier *after* some of the
>> > > > questions Jan raised (multi-collection? alias support? etc.) have been
>> > > > dealt with. i.e. It's tough to decide between /v2/cluster/backups and
>> > > > /v2/collections/<collectionName>/backups as alternatives until you
>> > > > figure out whether we currently support multi-collection backup, or
>> > > > want to in the near future. If people feel strongly or would veto the
>> > > > vote otherwise, then I'll try my best. But otherwise I think we're
>> > > > best served waiting until other stuff settles out to revisit larger v2
>> > > > backup API changes.
>> > > >
>> > > > Best,
>> > > >
>> > > > Jason
>> > > >
>> > > > On Mon, Jan 11, 2021 at 10:41 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>
>> > > >> Hey all,
>> > > >>
>> > > >> I've put replies to everyone's questions below. Hope they help!
>> > > >>
>> > > >>> Do the shard metadata files list all of the segments that make up the backup, or only the segments that were uploaded in this incremental update?
>> > > >>
>> > > >> Mike: The former - they're intended to hold metadata about all of the
>> > > >> segments that are needed to restore to the given
>> > > >> snapshot/commit-point. So it's likely to hold metadata about files
>> > > >> just uploaded, as well as ones that were added to the blob by previous
>> > > >> backups. I'll see if I can make that clearer in the file
>> > > >> descriptions.
>> > > >>
>> > > >>> leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup
>> > > >>
>> > > >> Jan: Ultimately I agree with your concerns about scope, so I'd vote
>> > > >> against trying to cover TRAs, multiple collection backups, etc. in
>> > > >> this effort here.
>> > > >>
>> > > >> That aside though, I agree that the existing v2 backup API is a bit of
>> > > >> a headscratcher. Why is it /v2/collections instead of
>> > > >> /v2/collections/<collectionName> or a subpath of /v2/cluster? Does it
>> > > >> have something to do with aliases? Or did it end up there mostly by
>> > > >> default? I'd be open to creating a new v2 backup endpoint (without
>> > > >> adding TRA, etc. compatibility) if there was consensus on that
>> > > >> approach to handling backcompat and on the specific appearance of the
>> > > >> API. It would help with backcompat after all. Though if finding
>> > > >> consensus bogs down it may not be worth the addition.
>> > > >>
>> > > >>> I know you've seen SOLR-15051 (Shared storage -- BlobDirectory) ... We both want to store checksums and file lengths. ... Your proposal did not discuss how these files are GC'ed
>> > > >>
>> > > >> David: SIP-12 does address this, though maybe the writeup needs
>> > > >> clarifying. The Delete Backup API includes a "purge" parameter which
>> > > >> triggers GC activity. This probably works about the way you'd expect
>> > > >> - Solr gets the list of UUID-named index files from the blob store,
>> > > >> and then it compares that list to the set of UUID's referenced by any
>> > > >> shard-metadata file (which requires reading all the shard-metadata
>> > > >> files). This avoids adding to Solr's ZK state, but does so at the
>> > > >> cost of requiring users to trigger sporadic cleanup manually instead
>> > > >> of detecting orphans automatically like BlobDirectory does (assuming I
>> > > >> understand that correctly).
>> > > >>
>> > > >> I'm def not saying this is the best approach necessarily. I like it,
>> > > >> though it has downsides for sure. Just that there is a proposed
>> > > >> approach that's easy to miss buried in the SIP.
>> > > >>
>> > > >> More broadly though - I share your sense that we should consider
>> > > >> alignment. It may end up that Backup/Restore is different enough from
>> > > >> the BlobDirectory usecase that it doesn't make sense, but it's at
>> > > >> least worth figuring out. That's about as far as my understanding
>> > > >> goes right now though. I'll read up on BlobDirectory while you absorb
>> > > >> SIP-12 and maybe we can circle back to this shortly.
>> > > >>
>> > > >> Best,
>> > > >>
>> > > >> Jason
>> > > >>
>> > > >> On Sun, Jan 10, 2021 at 7:20 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> > > >>>
>> > > >>> Jason, Shalin and Dat, thanks for the thorough work. This is an example for other SIPs to follow!
>> > > >>>
>> > > >>>> I've also amended the backcompat/migration section to mention Jan's
>> > > >>>> suggestion that the "incremental" features be exposed in the v2 API
>> > > >>>> only. Though it's unclear to me whether that's still something people
>> > > >>>> want since it turns out that we'll still have backcompat concerns with
>> > > >>>> the existing v2 backup/restore APIs. So I've held off from
>> > > >>>> removing/replacing the original plan.
>> > > >>>
>> > > >>> Since we already have v2 for the existing backup API, I guess my suggestion is not that 'clean' after all.
>> > > >>>
>> > > >>> Another approach would be to leave the old Backup/Restore API as-is, deprecated, and add a new one on /v2/cluster/backup, with support for backing up multiple collections in one go, or backup a TRA alias with hundreds of concrete "sub" collections. But as I write these words I imagine it probably is way outside the scope for this SIP which is large enough. Anyone even tried to backup a TRA with today's API?
>> > > >>>
>> > > >>> Jan
>> > > >>>
>> > > >>>> On 5 Jan 2021, at 15:55, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>>>
>> > > >>>> Hey, Happy New Year everybody.
>> > > >>>>
>> > > >>>> Some SIP updates based on the discussion above:
>> > > >>>>
>> > > >>>> I added v2 examples for each API to the SIP. Feedback welcome,
>> > > >>>> especially on the v2 APIs that are net-new to this proposal (namely:
>> > > >>>> "list backups" and "delete backup").
>> > > >>>>
>> > > >>>> I've also amended the backcompat/migration section to mention Jan's
>> > > >>>> suggestion that the "incremental" features be exposed in the v2 API
>> > > >>>> only. Though it's unclear to me whether that's still something people
>> > > >>>> want since it turns out that we'll still have backcompat concerns with
>> > > >>>> the existing v2 backup/restore APIs. So I've held off from
>> > > >>>> removing/replacing the original plan.
>> > > >>>>
>> > > >>>> Link for convenience:
>> > > >>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>> > > >>>>
>> > > >>>> Best,
>> > > >>>>
>> > > >>>> Jason
>> > > >>>>
>> > > >>>>
>> > > >>>> On Thu, Dec 24, 2020 at 8:11 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> > > >>>>>
>> > > >>>>> Ok, that’s the one I was looking for, it’s not documented in the backup chapter of ref-guide :(
>> > > >>>>>
>> > > >>>>> Jan Høydahl
>> > > >>>>>
>> > > >>>>>> On 23 Dec 2020, at 17:10, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>>>>>
>> > > >>>>>>>
>> > > >>>>>>> We have a path alias to the old API ... but we don’t have a true v2 API spec for it, do we?
>> > > >>>>>>
>> > > >>>>>> Tbh I'm not yet familiar enough with the v2 APIs to understand the
>> > > >>>>>> distinction you're making. (Do you have a pointer to something that'd
>> > > >>>>>> fill me in?)
>> > > >>>>>>
>> > > >>>>>> To zoom in on "backup" as an example, the v2 API I'm referring to
>> > > >>>>>> looks like: /v2/collections" -d '{ "backup-collection":
>> > > >>>>>> {"collection": "books", "name": "asdf3", "location": "/tmp/foo"}}'.
>> > > >>>>>> And it's included in the v2 "introspect" documentation returned by
>> > > >>>>>> this API: /v2/collections/_introspect?command=backup-collection". To
>> > > >>>>>> me that looked like a v2 API, but maybe path-aliases are also covered
>> > > >>>>>> in the introspect docs.
>> > > >>>>>>
>> > > >>>>>> Jason
>> > > >>>>>>
>> > > >>>>>>> On Wed, Dec 23, 2020 at 10:29 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> > > >>>>>>>
>> > > >>>>>>> Actually, don’t think we do have a v2 Backup/Restore API. We have a path alias to the old API which takes GET ...&action=backup... but we don’t have a true v2 API spec for it, do we? Where is that documented?
>> > > >>>>>>>
>> > > >>>>>>> Jan Høydahl
>> > > >>>>>>>
>> > > >>>>>>>>> On 22 Dec 2020, at 18:04, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>> Hey guys,
>> > > >>>>>>>>
>> > > >>>>>>>> Following up to make sure I understand the specifics you're
>> > > >>>>>>>> suggesting. You're proposing that:
>> > > >>>>>>>>
>> > > >>>>>>>> 1. The brand new backup-related APIs (list-backups and delete-backup)
>> > > >>>>>>>> be added in v2-form only.
>> > > >>>>>>>> 2. Tweaks to existing backup-related APIs (create-backup, restore) be
>> > > >>>>>>>> made in V2-form only.
>> > > >>>>>>>> 3. All existing v1 backup-related APIs be deprecated and left
>> > > >>>>>>>> unchanged. Incremental backups will not be possible using the v1 API.
>> > > >>>>>>>>
>> > > >>>>>>>> I'm not against going this route if there's consensus around it. But
>> > > >>>>>>>> I'm not 100% clear on how it means we don't need to worry about
>> > > >>>>>>>> backcompat. Backup and Restore currently exist as both a v1 and a v2
>> > > >>>>>>>> API - I understand how leaving the v1 APIs untouched (other than
>> > > >>>>>>>> deprecation) frees us of some backcompat concerns there, but we would
>> > > >>>>>>>> still need to make tweaks to the v2 backup/restore APIs and would have
>> > > >>>>>>>> to tread just as carefully there in terms of backcompat, afaict.
>> > > >>>>>>>> Unless Solr's backcompatibility guarantees only cover the v1 API and
>> > > >>>>>>>> leave v2 changes to be made freely? I looked around to see if the v2
>> > > >>>>>>>> APIs had any sort of "experimental" designation, but couldn't find
>> > > >>>>>>>> that clearly stated anywhere. Am I missing something?
>> > > >>>>>>>>
>> > > >>>>>>>> Best,
>> > > >>>>>>>>
>> > > >>>>>>>> Jason
>> > > >>>>>>>>
>> > > >>>>>>>>> On Tue, Dec 22, 2020 at 2:49 AM Noble Paul <noble.paul@gmail.com> wrote:
>> > > >>>>>>>>>
>> > > >>>>>>>>>> , and implement the new improved version as a V2-api only, and then deprecate the v1 API?
>> > > >>>>>>>>>
>> > > >>>>>>>>>
>> > > >>>>>>>>> V2 only please
>> > > >>>>>>>>>
>> > > >>>>>>>>>> On Tue, Dec 22, 2020 at 1:34 AM Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> Hey Jan, thanks for the review.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I hadn't thought about the V2 API in connection to this work. You're
>> > > >>>>>>>>>> right though I think - the SIP proposes net-new APIs, so it should add
>> > > >>>>>>>>>> V2 equivalents at the very least. I'll draft tentative details for
>> > > >>>>>>>>>> these APIs on the SIP and we can refine things from there.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> I'm more up in the air on your specific suggestion to restrict the SIP
>> > > >>>>>>>>>> changes to these v2 APIs. It is an elegant approach to the
>> > > >>>>>>>>>> backcompat, and it provides a carrot for v2 adoption - both of which I
>> > > >>>>>>>>>> like. But it would let users create snapshot-based backups (and keep
>> > > >>>>>>>>>> us maintaining that code) longer than there's any strict need to. And
>> > > >>>>>>>>>> users are left on the less-efficient format by default. (By contrast,
>> > > >>>>>>>>>> the current SIP has snapshot-backup creation being replaced by
>> > > >>>>>>>>>> incremental-backup creation as soon as the latter is available.). Did
>> > > >>>>>>>>>> you have a particular lifespan in mind for snapshot-based creation if
>> > > >>>>>>>>>> we go with this approach?
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> Jason
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> On Thu, Dec 17, 2020 at 3:54 PM Jan Høydahl <jan.asf@cominvent.com> wrote:
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> Much needed! Thanks for initiating this Jason!
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> As we want to move away from v1 APIs where an HTTP GET is used for creation and deletion, would it be an idea to leave the old backup/restore APIs as-is, and implement the new improved version as a V2-api only, and then deprecate the v1 API? Then we don't need to worry about back-compat, and we get a head-start on converting the COLLECTION API to v2 style.
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> Jan
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>>> On 15 Dec 2020, at 15:48, Jason Gerlowski <gerlowskija@gmail.com> wrote:
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Hey all,
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> This morning I published SIP-12, which proposes an overhaul of Solr's
>> > > >>>>>>>>>>>> backup and restore functionality. While the "headline" improvement in
>> > > >>>>>>>>>>>> this SIP is a change to do backups incrementally, it bundles in a
>> > > >>>>>>>>>>>> number of other improvements as well, including the addition of
>> > > >>>>>>>>>>>> corruption checks, APIs to list and delete backups, and stronger
>> > > >>>>>>>>>>>> integration points with popular object storage APIs.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> The SIP can be found here:
>> > > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Please read the SIP description and come back here for discussion. As
>> > > >>>>>>>>>>>> the discussion progresses we will update the SIP page with any
>> > > >>>>>>>>>>>> outcomes and eventually move things to a VOTE.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Looking forward to hearing your feedback.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Best,
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Jason
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>> --
>> > > >>>>>>>>> -----------------------------------------------------
>> > > >>>>>>>>> Noble Paul
>> > > >>>>>>>>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
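
The "purge" behaviour described earlier in the thread (list the UUID-named index files in the store, then compare that list against every UUID referenced by any shard-metadata file) amounts to a set difference. Below is a minimal sketch of that idea only, not the SIP-12 implementation. The metadata layout (a dict with a "files" list), the function name, and the sample UUIDs are all hypothetical.

```python
def find_orphaned_index_files(stored_file_ids, shard_metadata_files):
    """Return stored index files that no remaining shard-metadata file references."""
    referenced = set()
    for metadata in shard_metadata_files:
        # Each shard-metadata file lists ALL files needed to restore that shard,
        # including files uploaded by earlier backups.
        referenced.update(metadata["files"])
    return set(stored_file_ids) - referenced


# Hypothetical state: backup 1 referenced uuid-a and uuid-b and has since been
# deleted, so only backup 2's shard metadata remains.
stored = ["uuid-a", "uuid-b", "uuid-c"]
remaining_metadata = [{"files": ["uuid-b", "uuid-c"]}]

print(sorted(find_orphaned_index_files(stored, remaining_metadata)))
# prints ['uuid-a']
```

Because each shard-metadata file records every file it needs, a file still shared with a surviving backup (uuid-b here) is never treated as garbage. The cost noted in the thread is that every shard-metadata file has to be read on each purge.
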
