Mailing List Archive

volume move and SFO
Good morning
I have a four-node cluster; nodes 1/2 are SAS/SATA and nodes 3/4 are AFF. I have a long-running volume move going from node 1 to node 4. Long running as in 30TB+, and it's about 60% done. I need to do some hardware maintenance on nodes 1 and 2 tomorrow evening (install additional FlashCache cards). Will a takeover of node 2 by node 1, then a takeover of node 1 by node 2, interrupt this volume move? I can't seem to find much in the way of documentation about what happens to a volume move during an SFO, but it's possible I'm just not looking hard enough. Thanks for any insights.
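
For reference, the plan is the usual rolling takeover/giveback so each controller can be powered down in turn, roughly like this (node names shortened here, and I'd verify the exact syntax against the docs before running it):

    ::> storage failover takeover -ofnode node2     (install the card in node 2, then)
    ::> storage failover giveback -ofnode node2
    ::> storage failover takeover -ofnode node1     (install the card in node 1, then)
    ::> storage failover giveback -ofnode node1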


Ian Ehrenwald
Senior Infrastructure Engineer
Hachette Book Group, Inc.
1.617.263.1948 / ian.ehrenwald@hbgusa.com


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
Ian,

Just a suggestion (it's been a while, but I think this is how I removed the
throttle in 9.1):
volume move governor*> ?
modify *Modify the governor configuration
show *Display the governor configuration
https://community.netapp.com/t5/Data-ONTAP-Discussions/How-many-vol-move-operations-can-be-active-at-same-time/td-p/129331
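
If I remember right it sits at diag privilege, so something along these lines (the exact name of the concurrency knob escapes me, but modify's inline help will list it):

    ::> set -privilege diagnostic
    ::*> volume move governor show
    ::*> volume move governor modify ?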

You should be able to move that data fairly quickly. I've noticed since
upgrading to 9.1 that the throttle is definitely more visible -- and even
with the throttle removed I haven't seen a noticeable impact on the rest of
the system.

I would suggest against keeping the vol move running during the takeover,
if it's even possible.

On Wed, Aug 8, 2018 at 10:33 AM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com>
wrote:

> Good morning
> I have a four node cluster, nodes1/2 are SAS/SATA and nodes 3/4 are AFF.
> I have a long running volume move going from node 1 to node 4. Long
> running, like 30TB+, and it's about 60% done. I need to do some hardware
> maintenance on nodes 1 and 2 tomorrow evening (install additional
> FlashCache cards). Will a takeover of node 2 by node 1, then a takeover of
> node 1 by node 2, interrupt this volume move? I can't seem to find much in
> the way of documentation about what happens during a SFO and a volume move,
> but it's possible I'm just not looking hard enough. Thanks for any
> insights.
>
>
> Ian Ehrenwald
> Senior Infrastructure Engineer
> Hachette Book Group, Inc.
> 1.617.263.1948 / ian.ehrenwald@hbgusa.com
>
>
> _______________________________________________
> Toasters mailing list
> Toasters@teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters
>
Re: volume move and SFO [ In reply to ]
Hi Douglas
Thanks for writing. If I'm understanding that governor correctly, it controls the number of concurrent moves? In this specific instance I'm moving one volume of 30+ TB, so I don't think it's entirely applicable to the situation. Definitely correct me if I'm wrong, though.

That being said, when the source aggregate is not under high load I am able to get 400MB/s or higher in replication throughput, which is pretty cool.

There IS a SnapMirror speed throttle that I would bump up against occasionally, and that was addressed with a "setflag repl_throttle_enable 0" locally on each node while in diag mode. That really did make a difference in SnapMirror speed, enough of a difference that Jeffrey Steiner @ NetApp did some poking around internally to see why it's enabled in the first place. I don't recall the outcome of that poking.
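
For the archives, this is more or less how we set it on each node (typing from memory, so treat the exact nodeshell steps as approximate):

    ::> set -privilege diagnostic
    ::*> system node run -node <nodename>
    > priv set diag
    *> setflag repl_throttle_enable 0
    *> exit

Repeat on the other nodes, and keep in mind that diag-level flags like this generally don't survive a reboot.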

Either way, I guess we'll find out what happens when there's an SFO while a volume move is happening?


________________________________________
From: Douglas Siggins <siggins@gmail.com>
Sent: Wednesday, August 8, 2018 1:11:31 PM
To: Ian Ehrenwald
Cc: toasters@teaparty.net
Subject: Re: volume move and SFO

Ian,

Just a suggestion (its been a while but I think this is how I removed the throttle in 9.1):
volume move governor*> ?
modify *Modify the governor configuration
show *Display the governor configuration
https://community.netapp.com/t5/Data-ONTAP-Discussions/How-many-vol-move-operations-can-be-active-at-same-time/td-p/129331

You should be able to move that data in a pretty rapid period of time. I've noticed when upgrading to 9.1 the throttle is definitely more visible -- even when removing the throttle there isn't a noticeable impact.

I would suggest against keeping the vol move running during the takeover, if its even possible.

On Wed, Aug 8, 2018 at 10:33 AM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:
Good morning
I have a four node cluster, nodes1/2 are SAS/SATA and nodes 3/4 are AFF. I have a long running volume move going from node 1 to node 4. Long running, like 30TB+, and it's about 60% done. I need to do some hardware maintenance on nodes 1 and 2 tomorrow evening (install additional FlashCache cards). Will a takeover of node 2 by node 1, then a takeover of node 1 by node 2, interrupt this volume move? I can't seem to find much in the way of documentation about what happens during a SFO and a volume move, but it's possible I'm just not looking hard enough. Thanks for any insights.


Ian Ehrenwald
Senior Infrastructure Engineer
Hachette Book Group, Inc.
1.617.263.1948 / ian.ehrenwald@hbgusa.com


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
I was thinking throttle, but forgot the exact command.

Yes 400 MB/s is typically what I see with 10G.

I'm thinking it'll make you cancel the vol move (without an override), but
either way just make sure it's not finalizing. I've been in situations where
vol moves have not cut over properly for one reason or another, and IO on
the source stops.

http://www.datacenterdude.com/storage/partial-givebacks-storage-failovers-netapps-clustered-data-ontap/

This does mention vetoes ....
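
If you do end up taking a node over with the move still running, something like this is what I'd be watching (from memory, so verify the syntax; <node> is whichever node you're servicing):

    ::> volume move show
    ::> storage failover takeover -ofnode <node>
    ::> storage failover show-giveback
    ::> storage failover giveback -ofnode <node>

show-giveback should tell you whether anything is vetoing the giveback. Giveback also has an -override-vetoes option, but I wouldn't force it while the move is anywhere near cutover.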




On Wed, Aug 8, 2018 at 1:35 PM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com>
wrote:

> Hi Douglas
> Thanks for writing. If I am understanding that governor correctly, that
> is for the number of concurrent moves? In this specific instance, I'm
> moving one volume of size 30+ TB, so I don't think it is entirely
> applicable for the situation. Definitely correct me if I'm wrong, though.
>
> That being said, when the source aggregate is not under high load, I am
> able to get 400MB/s or higher in replication throughput which is pretty
> cool.
>
> There IS a Snapmirror speed throttle that I would bump against
> occasionally, and that was addressed with a "setflag repl_throttle_enable
> 0" locally on each node while in diag mode. That really did make a
> difference in Snapmirror speed, enough of a difference that Jeffrey Steiner
> @ NetApp did some poking around internally to see why it's enabled in the
> first place. I don't recall the outcome of that poking.
>
> Either way, I guess we'll find out what happens when there's a SFO while a
> volume move is happening?
>
>
> ________________________________________
> From: Douglas Siggins <siggins@gmail.com>
> Sent: Wednesday, August 8, 2018 1:11:31 PM
> To: Ian Ehrenwald
> Cc: toasters@teaparty.net
> Subject: Re: volume move and SFO
>
> Ian,
>
> Just a suggestion (its been a while but I think this is how I removed the
> throttle in 9.1):
> volume move governor*> ?
> modify *Modify the governor configuration
> show *Display the governor configuration
>
> https://community.netapp.com/t5/Data-ONTAP-Discussions/How-many-vol-move-operations-can-be-active-at-same-time/td-p/129331
>
> You should be able to move that data in a pretty rapid period of time.
> I've noticed when upgrading to 9.1 the throttle is definitely more visible
> -- even when removing the throttle there isn't a noticeable impact.
>
> I would suggest against keeping the vol move running during the takeover,
> if its even possible.
>
> On Wed, Aug 8, 2018 at 10:33 AM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com
> <mailto:Ian.Ehrenwald@hbgusa.com>> wrote:
> Good morning
> I have a four node cluster, nodes1/2 are SAS/SATA and nodes 3/4 are AFF.
> I have a long running volume move going from node 1 to node 4. Long
> running, like 30TB+, and it's about 60% done. I need to do some hardware
> maintenance on nodes 1 and 2 tomorrow evening (install additional
> FlashCache cards). Will a takeover of node 2 by node 1, then a takeover of
> node 1 by node 2, interrupt this volume move? I can't seem to find much in
> the way of documentation about what happens during a SFO and a volume move,
> but it's possible I'm just not looking hard enough. Thanks for any
> insights.
>
>
> Ian Ehrenwald
> Senior Infrastructure Engineer
> Hachette Book Group, Inc.
> 1.617.263.1948 / ian.ehrenwald@hbgusa.com
>
>
> _______________________________________________
> Toasters mailing list
> Toasters@teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters
>
Re: volume move and SFO [ In reply to ]
How long did it take to get to your 60% point that you reference? Some
quick math says 30TB at 400MB/s should complete in about 22 hours. If
you're 60 percent done, and you have a day and a half, then it should
complete before tomorrow night, right?
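
Showing my work in case the quick math is off (treating TB and MB as binary units, which is how ONTAP reports them):

    30 TB x 1,048,576 MB/TB = 31,457,280 MB
    31,457,280 MB / 400 MB/s = 78,643 s, or roughly 21.8 hours end to end

With 40% (about 12 TB) left, that's only another 8-9 hours if it can hold 400MB/s.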

On Wed, Aug 8, 2018 at 12:49 PM, Douglas Siggins <siggins@gmail.com> wrote:

> I was thinking throttle, but forgot the exact command.
>
> Yes 400 MB/s is typically what I see with 10G.
>
> I'm thinking itll make you cancel the vol move (without an override), but
> either way just make sure its not finalizing. I've been in situations where
> vol moves have not cutover properly for one reason or another, and IO on
> the source stops.
>
> http://www.datacenterdude.com/storage/partial-givebacks-storage-failovers-netapps-clustered-data-ontap/
>
> This does mention vetoes ....
>
>
>
>
> On Wed, Aug 8, 2018 at 1:35 PM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com>
> wrote:
>
>> Hi Douglas
>> Thanks for writing. If I am understanding that governor correctly, that
>> is for the number of concurrent moves? In this specific instance, I'm
>> moving one volume of size 30+ TB, so I don't think it is entirely
>> applicable for the situation. Definitely correct me if I'm wrong, though.
>>
>> That being said, when the source aggregate is not under high load, I am
>> able to get 400MB/s or higher in replication throughput which is pretty
>> cool.
>>
>> There IS a Snapmirror speed throttle that I would bump against
>> occasionally, and that was addressed with a "setflag repl_throttle_enable
>> 0" locally on each node while in diag mode. That really did make a
>> difference in Snapmirror speed, enough of a difference that Jeffrey Steiner
>> @ NetApp did some poking around internally to see why it's enabled in the
>> first place. I don't recall the outcome of that poking.
>>
>> Either way, I guess we'll find out what happens when there's a SFO while
>> a volume move is happening?
>>
>>
>> ________________________________________
>> From: Douglas Siggins <siggins@gmail.com>
>> Sent: Wednesday, August 8, 2018 1:11:31 PM
>> To: Ian Ehrenwald
>> Cc: toasters@teaparty.net
>> Subject: Re: volume move and SFO
>>
>> Ian,
>>
>> Just a suggestion (its been a while but I think this is how I removed the
>> throttle in 9.1):
>> volume move governor*> ?
>> modify *Modify the governor configuration
>> show *Display the governor configuration
>> https://community.netapp.com/t5/Data-ONTAP-Discussions/How-many-vol-move-operations-can-be-active-at-same-time/td-p/129331
>>
>> You should be able to move that data in a pretty rapid period of time.
>> I've noticed when upgrading to 9.1 the throttle is definitely more visible
>> -- even when removing the throttle there isn't a noticeable impact.
>>
>> I would suggest against keeping the vol move running during the takeover,
>> if its even possible.
>>
>> On Wed, Aug 8, 2018 at 10:33 AM Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:
>> Good morning
>> I have a four node cluster, nodes1/2 are SAS/SATA and nodes 3/4 are AFF.
>> I have a long running volume move going from node 1 to node 4. Long
>> running, like 30TB+, and it's about 60% done. I need to do some hardware
>> maintenance on nodes 1 and 2 tomorrow evening (install additional
>> FlashCache cards). Will a takeover of node 2 by node 1, then a takeover of
>> node 1 by node 2, interrupt this volume move? I can't seem to find much in
>> the way of documentation about what happens during a SFO and a volume move,
>> but it's possible I'm just not looking hard enough. Thanks for any
>> insights.
>>
>>
>> Ian Ehrenwald
>> Senior Infrastructure Engineer
>> Hachette Book Group, Inc.
>> 1.617.263.1948 / ian.ehrenwald@hbgusa.com
>>
>>
>> _______________________________________________
>> Toasters mailing list
>> Toasters@teaparty.net
>> http://www.teaparty.net/mailman/listinfo/toasters
>>
>
> _______________________________________________
> Toasters mailing list
> Toasters@teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters
>
>
Re: volume move and SFO [ In reply to ]
Hi Mike
Somewhat redacted output from a few minutes ago:

MyCluster1::> volume move show -instance

Vserver Name: mySvm
Volume Name: aVeryLargeVolume
Actual Completion Time: -
Bytes Remaining: 9.44TB
Destination Aggregate: aggr_ssd_3800g_c1n4
Detailed Status: Transferring data: 20.96TB sent.
Estimated Time of Completion: Thu Aug 09 01:58:40 2018
Managing Node: Clus1-Node1
Percentage Complete: 68%
Move Phase: replicating
Estimated Remaining Duration: 11:45:25
Replication Throughput: 233.9MB/s
Duration of Move: 23:47:43
Source Aggregate: aggr_sas_600g_c1n1
Start Time of Move: Tue Aug 07 14:25:38 2018
Move State: healthy
Is Source Volume Encrypted: false
Encryption Key ID of Source Volume: -
Is Destination Volume Encrypted: false
Encryption Key ID of Destination Volume: -

MyCluster1::>


Depending on source filer load, I've seen Replication Throughput anywhere from 25MB/s up to 400MB/s and higher. The source aggregate is 192x600GB SAS on a filer with 2TB of FlashCache, and it sometimes sees periods of 30K IOPS. The point is that the ETA has been anywhere from just about now all the way out to this coming Sunday. According to these numbers, though, there's a good chance the move will complete before tomorrow evening. If it doesn't complete in time, I guess we'll find out what effect, if any, SFO has (unless I can easily reschedule this HW maintenance).
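
As a sanity check on ONTAP's estimate: 9.44TB remaining at the current 233.9MB/s is about 9.44 x 1,048,576 / 233.9 = 42,300 seconds, or roughly 11 hours 45 minutes, which matches the Estimated Remaining Duration above. In other words, the ETA is only as good as whatever the throughput happens to be at that moment.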



________________________________________
From: Mike Gossett <cmgossett@gmail.com>
Sent: Wednesday, August 8, 2018 13:59
To: Douglas Siggins
Cc: Ian Ehrenwald; Toasters
Subject: Re: volume move and SFO

How long did it take to get to your 60% point that you reference? Some quick math says 30TB at 400MB/s should complete in about 22 hours. If you're 60 percent done, and you have a day and a half, then it should complete before tomorrow night, right?

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
Hi Ian,

The good news is that, setting aside what it’s estimating, we’ve seen 21TB copied in 24 hours. Hopefully another 30 hours or so is enough for the remaining 9.5TB. Thanks for sharing. I’m interested to know the result of the SFO, but if it were me I’d try to push the maintenance back to be safe, or open a ticket with support and see if they can tell you what to expect.

> On Aug 8, 2018, at 1:29 PM, Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:
>
> Hi Mike
> Somewhat redacted output from a few minutes ago:
>
> MyCluster1::> volume move show -instance
>
> Vserver Name: mySvm
> Volume Name: aVeryLargeVolume
> Actual Completion Time: -
> Bytes Remaining: 9.44TB
> Destination Aggregate: aggr_ssd_3800g_c1n4
> Detailed Status: Transferring data: 20.96TB sent.
> Estimated Time of Completion: Thu Aug 09 01:58:40 2018
> Managing Node: Clus1-Node1
> Percentage Complete: 68%
> Move Phase: replicating
> Estimated Remaining Duration: 11:45:25
> Replication Throughput: 233.9MB/s
> Duration of Move: 23:47:43
> Source Aggregate: aggr_sas_600g_c1n1
> Start Time of Move: Tue Aug 07 14:25:38 2018
> Move State: healthy
> Is Source Volume Encrypted: false
> Encryption Key ID of Source Volume: -
> Is Destination Volume Encrypted: false
> Encryption Key ID of Destination Volume: -
>
> MyCluster1::>
>
>
> Depending on source filer load, I've seen Replication Throughput anywhere from 25MB/s through 400MB/s and higher. The source aggregate is 192x600GB SAS on a filer with 2TB FlashCache and it sometimes sees periods of 30K IOPS. I guess the point is ETA has been anywhere from just about now all the way out to this coming Sunday. There's a good chance that the move will complete before tomorrow evening according to these numbers though. If it doesn't complete on time, I guess we'll find out what effect, if any, SFO has (unless I can easily reschedule this HW maintenance).
>
>
>
> ________________________________________
> From: Mike Gossett <cmgossett@gmail.com>
> Sent: Wednesday, August 8, 2018 13:59
> To: Douglas Siggins
> Cc: Ian Ehrenwald; Toasters
> Subject: Re: volume move and SFO
>
> How long did it take to get to your 60% point that you reference? Some quick math says 30TB at 400MB/s should complete in about 22 hours. If you're 60 percent done, and you have a day and a half, then it should complete before tomorrow night, right?


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
Good afternoon
Update, as promised: the volume move completed early this morning, so there will be no conflict with our maintenance tonight. Inline compaction is AWESOME - it has saved us over 16TB on this new SSD aggregate. Crazy stuff.

________________________________________
From: Mike Gossett <cmgossett@gmail.com>
Sent: Wednesday, August 8, 2018 2:37:35 PM
To: Ian Ehrenwald
Cc: Douglas Siggins; Toasters
Subject: Re: volume move and SFO

Hi Ian,

The good news is that, setting aside what it’s estimating, we’ve seen 21TB copied in 24 hours. Hopefully another 30 hours or so is enough for the remaining 9.5TB. Thanks for sharing. I’m interested to know the result of the SFO, but if it were me I’d try to push the maintenance back to be safe, or open a ticket with support and see if they can tell you what to expect.

> On Aug 8, 2018, at 1:29 PM, Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:
>
> Hi Mike
> Somewhat redacted output from a few minutes ago:
>
> MyCluster1::> volume move show -instance
>
> Vserver Name: mySvm
> Volume Name: aVeryLargeVolume
> Actual Completion Time: -
> Bytes Remaining: 9.44TB
> Destination Aggregate: aggr_ssd_3800g_c1n4
> Detailed Status: Transferring data: 20.96TB sent.
> Estimated Time of Completion: Thu Aug 09 01:58:40 2018
> Managing Node: Clus1-Node1
> Percentage Complete: 68%
> Move Phase: replicating
> Estimated Remaining Duration: 11:45:25
> Replication Throughput: 233.9MB/s
> Duration of Move: 23:47:43
> Source Aggregate: aggr_sas_600g_c1n1
> Start Time of Move: Tue Aug 07 14:25:38 2018
> Move State: healthy
> Is Source Volume Encrypted: false
> Encryption Key ID of Source Volume: -
> Is Destination Volume Encrypted: false
> Encryption Key ID of Destination Volume: -
>
> MyCluster1::>
>
>
> Depending on source filer load, I've seen Replication Throughput anywhere from 25MB/s through 400MB/s and higher. The source aggregate is 192x600GB SAS on a filer with 2TB FlashCache and it sometimes sees periods of 30K IOPS. I guess the point is ETA has been anywhere from just about now all the way out to this coming Sunday. There's a good chance that the move will complete before tomorrow evening according to these numbers though. If it doesn't complete on time, I guess we'll find out what effect, if any, SFO has (unless I can easily reschedule this HW maintenance).
>
>
>
> ________________________________________
> From: Mike Gossett <cmgossett@gmail.com>
> Sent: Wednesday, August 8, 2018 13:59
> To: Douglas Siggins
> Cc: Ian Ehrenwald; Toasters
> Subject: Re: volume move and SFO
>
> How long did it take to get to your 60% point that you reference? Some quick math says 30TB at 400MB/s should complete in about 22 hours. If you're 60 percent done, and you have a day and a half, then it should complete before tomorrow night, right?


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
Based on a Justin Parisi article that Douglas pointed out earlier in the thread, I'm going with the assumption that the cluster would veto the giveback if a move was in progress.

I actually just got a new FAS2720 HA pair for my lab this week, so when it's set up and I have a bunch of junk data and volumes on it for testing, I will see what happens if we were to do a move, takeover, and giveback in Real Life (tm). I can't make any promises on an ETA for an answer, but as soon as I do it I'll send an update.
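
The test will probably look something like this (rough sketch; the vserver, volume, aggregate, and node names below are placeholders, and I'll double check the options against the docs before running it):

    ::> volume move start -vserver testSvm -volume junkVol -destination-aggregate aggr_partner_test
    ::> volume move show
    ::> storage failover takeover -ofnode lab-node1
        (wait for takeover, do the pretend maintenance, then)
    ::> storage failover giveback -ofnode lab-node1
    ::> volume move show
        (did the move resume, restart, or get cancelled?)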


________________________________________
From: RUIS Henk <henk.ruis@axians.com>
Sent: Friday, August 10, 2018 4:28:30 AM
To: Ian Ehrenwald; Mike Gossett
Cc: Toasters
Subject: RE: volume move and SFO

Hi,

Great, but now we still don't know if it's possible to move a volume during maintenance ;-(

Met vriendelijke groet / Kind regards,

Henk Ruis
Technical Consultant


-----Original Message-----
From: toasters-bounces@teaparty.net <toasters-bounces@teaparty.net> On behalf of Ian Ehrenwald
Sent: Thursday, August 9, 2018 18:13
To: Mike Gossett <cmgossett@gmail.com>
CC: Toasters <toasters@teaparty.net>
Subject: Re: volume move and SFO

Good afternoon
Update, as promised. The volume move completed early this morning so there will be no conflict with our maintenance tonight. Inline compaction is AWESOME - it has saved us over 16TB on this new SSD aggregate. Crazy stuff.

________________________________________
From: Mike Gossett <cmgossett@gmail.com>
Sent: Wednesday, August 8, 2018 2:37:35 PM
To: Ian Ehrenwald
Cc: Douglas Siggins; Toasters
Subject: Re: volume move and SFO

Hi Ian,

The good news is that, setting aside what it's estimating, we've seen 21TB copied in 24 hours. Hopefully another 30 hours or so is enough for the remaining 9.5TB. Thanks for sharing. I'm interested to know the result of the SFO, but if it were me I'd try to push the maintenance back to be safe, or open a ticket with support and see if they can tell you what to expect.

> On Aug 8, 2018, at 1:29 PM, Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:
>
> Hi Mike
> Somewhat redacted output from a few minutes ago:
>
> MyCluster1::> volume move show -instance
>
> Vserver Name: mySvm
> Volume Name: aVeryLargeVolume
> Actual Completion Time: -
> Bytes Remaining: 9.44TB
> Destination Aggregate: aggr_ssd_3800g_c1n4
> Detailed Status: Transferring data: 20.96TB sent.
> Estimated Time of Completion: Thu Aug 09 01:58:40 2018
> Managing Node: Clus1-Node1
> Percentage Complete: 68%
> Move Phase: replicating
> Estimated Remaining Duration: 11:45:25
> Replication Throughput: 233.9MB/s
> Duration of Move: 23:47:43
> Source Aggregate: aggr_sas_600g_c1n1
> Start Time of Move: Tue Aug 07 14:25:38 2018
> Move State: healthy
> Is Source Volume Encrypted: false
> Encryption Key ID of Source Volume: -
> Is Destination Volume Encrypted: false
> Encryption Key ID of Destination Volume: -
>
> MyCluster1::>
>
>
> Depending on source filer load, I've seen Replication Throughput anywhere from 25MB/s through 400MB/s and higher. The source aggregate is 192x600GB SAS on a filer with 2TB FlashCache and it sometimes sees periods of 30K IOPS. I guess the point is ETA has been anywhere from just about now all the way out to this coming Sunday. There's a good chance that the move will complete before tomorrow evening according to these numbers though. If it doesn't complete on time, I guess we'll find out what effect, if any, SFO has (unless I can easily reschedule this HW maintenance).
>
>
>
> ________________________________________
> From: Mike Gossett <cmgossett@gmail.com>
> Sent: Wednesday, August 8, 2018 13:59
> To: Douglas Siggins
> Cc: Ian Ehrenwald; Toasters
> Subject: Re: volume move and SFO
>
> How long did it take to get to your 60% point that you reference? Some quick math says 30TB at 400MB/s should complete in about 22 hours. If you're 60 percent done, and you have a day and a half, then it should complete before tomorrow night, right?


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: volume move and SFO [ In reply to ]
Here’s a repost, with a link to the TR rather than the actual pdf.

https://www.netapp.com/us/media/tr-4075.pdf

Francis Kim
Cell: 415-606-2525
Direct: 510-644-1599 x334
fkim@berkcom.com
www.berkcom.com

On Aug 11, 2018, at 12:02 PM, Francis Kim <fkim@BERKCOM.com> wrote:

Ian,
<tr-4075 DataMotion for Volumes NetApp clustered Data ONTAP 8.2 and 8.3.pdf>
With recurring controller and shelf refreshes, I’ve become a frequent vol mover (and a fan of it) over the last two years. The NDO aspect of this task is obviously a big draw.

The attached TR-4075, dated March 2015, has the most detailed explanation of vol moves I’ve been able to find, but its discussion is limited to 8.2 vs. 8.3. There’s nothing about ONTAP 9 in this TR, so YMMV.

In this TR, much is made of the two phases of a vol move, the iterative phase (baseline and updates) and cutover, with respect to which other operations are (in)compatible with the progress of a vol move.

During the iterative phase, a vol move in 8.2 would have to be restarted after a FO/GB, while in 8.3 it would resume from its most recent checkpoint.

However, once the cutover phase has been entered, a vol move in 8.2 would survive a FO/GB if it had crossed its “point of no return” checkpoint, while an 8.3 vol move cutover is mutually exclusive with a FO/GB, suggesting the cutover would have to be reattempted afterward.

I’ve not been able to find under 9.1 (even in diag mode) any information specific to these checkpoints. Not sure whether “Bytes sent” is a checkpoint.

Since documentation on vol moves is generally skinny and this TR is now over three years and five releases old, a lab run is probably a good move if you have access to gear.
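
One thing that may help with timing the cutover around a maintenance window: the cutover behavior is at least somewhat controllable from the CLI. This is from memory, so check the volume move start man page on your release, and the vserver/volume/aggregate names below are just placeholders:

    ::> volume move start -vserver svm1 -volume bigvol -destination-aggregate aggr_dst -cutover-action wait
    ::> volume move trigger-cutover -vserver svm1 -volume bigvol

With -cutover-action wait the move replicates but holds off on cutover until you trigger it yourself, which at least keeps the cutover phase out of the FO/GB window.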

Francis Kim
Cell: 415-606-2525
Direct: 510-644-1599 x334
fkim@berkcom.com
www.berkcom.com

On Aug 10, 2018, at 2:55 AM, Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:

Based on a Justin Parisi article that Douglas pointed out earlier in the thread, I'm going with the assumption that the cluster would veto the giveback if a move was in progress.

I actually just got a new FAS2720 HA pair for my lab this week, so when it's set up and I have a bunch of junk data and volumes on it for testing, I will see what happens if we were to do a move, takeover, and giveback in Real Life (tm). I can't make any promises on an ETA for answer, but as soon as I do it I'll send an update.


________________________________________
From: RUIS Henk <henk.ruis@axians.com>
Sent: Friday, August 10, 2018 4:28:30 AM
To: Ian Ehrenwald; Mike Gossett
Cc: Toasters
Subject: RE: volume move and SFO

Hi,

Great, but now we still don't know if it's possible to move a volume during maintenance ;-(

Met vriendelijke groet / Kind regards,

Henk Ruis
Technical Consultant


-----Original Message-----
From: toasters-bounces@teaparty.net <toasters-bounces@teaparty.net> On behalf of Ian Ehrenwald
Sent: Thursday, August 9, 2018 18:13
To: Mike Gossett <cmgossett@gmail.com>
CC: Toasters <toasters@teaparty.net>
Subject: Re: volume move and SFO

Good afternoon
Update, as promised. The volume move completed early this morning so there will be no conflict with our maintenance tonight. Inline compaction is AWESOME - it has saved us over 16TB on this new SSD aggregate. Crazy stuff.

________________________________________
From: Mike Gossett <cmgossett@gmail.com>
Sent: Wednesday, August 8, 2018 2:37:35 PM
To: Ian Ehrenwald
Cc: Douglas Siggins; Toasters
Subject: Re: volume move and SFO

Hi Ian,

The good news is that, setting aside what it's estimating, we've seen 21TB copied in 24 hours. Hopefully another 30 hours or so is enough for the remaining 9.5TB. Thanks for sharing. I'm interested to know the result of the SFO, but if it were me I'd try to push the maintenance back to be safe, or open a ticket with support and see if they can tell you what to expect.

On Aug 8, 2018, at 1:29 PM, Ian Ehrenwald <Ian.Ehrenwald@hbgusa.com> wrote:

Hi Mike
Somewhat redacted output from a few minutes ago:

MyCluster1::> volume move show -instance

Vserver Name: mySvm
Volume Name: aVeryLargeVolume
Actual Completion Time: -
Bytes Remaining: 9.44TB
Destination Aggregate: aggr_ssd_3800g_c1n4
Detailed Status: Transferring data: 20.96TB sent.
Estimated Time of Completion: Thu Aug 09 01:58:40 2018
Managing Node: Clus1-Node1
Percentage Complete: 68%
Move Phase: replicating
Estimated Remaining Duration: 11:45:25
Replication Throughput: 233.9MB/s
Duration of Move: 23:47:43
Source Aggregate: aggr_sas_600g_c1n1
Start Time of Move: Tue Aug 07 14:25:38 2018
Move State: healthy
Is Source Volume Encrypted: false
Encryption Key ID of Source Volume: -
Is Destination Volume Encrypted: false
Encryption Key ID of Destination Volume: -

MyCluster1::>


Depending on source filer load, I've seen Replication Throughput anywhere from 25MB/s through 400MB/s and higher. The source aggregate is 192x600GB SAS on a filer with 2TB FlashCache and it sometimes sees periods of 30K IOPS. I guess the point is ETA has been anywhere from just about now all the way out to this coming Sunday. There's a good chance that the move will complete before tomorrow evening according to these numbers though. If it doesn't complete on time, I guess we'll find out what effect, if any, SFO has (unless I can easily reschedule this HW maintenance).



________________________________________
From: Mike Gossett <cmgossett@gmail.com>
Sent: Wednesday, August 8, 2018 13:59
To: Douglas Siggins
Cc: Ian Ehrenwald; Toasters
Subject: Re: volume move and SFO

How long did it take to get to your 60% point that you reference? Some quick math says 30TB at 400MB/s should complete in about 22 hours. If you're 60 percent done, and you have a day and a half, then it should complete before tomorrow night, right?


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters