Mailing List Archive

Automated vs Manual NDU
Toasters,

What's your preference for non-disruptively upgrading a switch-based
ONTAP 9 cluster - automated NDU or manual (rolling) NDU?

Happy to hear of both positive and negative experiences, if any.

The cluster in question consists of 3 HA pairs so the automated
upgrade will default to rolling. The general recommendation is to use
the automated procedure but there are concerns about lack of control,
especially in the event of issues. Each HA pair in the cluster hosts
critical prod workloads.

No access to a test cluster so there isn't much opportunity to build
confidence in the automated procedure ahead of time. I am aware of
the ability to pause the automated upgrade.
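
For reference, the control points of the automated procedure can be sketched as below. This is a rough illustration rather than a runbook: the helper name is hypothetical, the version string is only an example, and the `cluster image` commands are written from memory of the ONTAP 9 CLI, so verify them against the command reference for your release. The sketch only builds command strings; it does not talk to a cluster.

```python
def automated_ndu_commands(version: str) -> dict[str, str]:
    """Map each stage of an automated NDU to the CLI command an admin would type.

    `version` is the target ONTAP image version already present in the
    cluster image repository (assumption: the image was downloaded earlier).
    """
    return {
        # Pre-checks only; does not change anything on the cluster.
        "validate": f"cluster image validate -version {version}",
        # Starts the automated (rolling or batch) update.
        "start": f"cluster image update -version {version}",
        # Shows which phase and which node the update is working on.
        "progress": "cluster image show-update-progress",
        # Manual control points if something looks wrong mid-update.
        "pause": "cluster image pause-update",
        "resume": "cluster image resume-update",
    }


if __name__ == "__main__":
    for stage, command in automated_ndu_commands("9.3P18").items():
        print(f"{stage:>8}: {command}")
```

Keeping the pause and resume commands next to the start command makes it easier to see how much manual control remains during an automated run.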

Leaning toward manual at the moment due to lack of exposure to the
automated process.

Cheers,
Phil
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters
Re: Automated vs Manual NDU
Thanks for the info. I'm familiar with the vetoed giveback due to
CIFS - we hit that during unplanned failover events. Good to know I
can expect that during upgrades as well.

Are you initiating the upgrade from the GUI? Also, when you override
the CIFS veto, do you then need to issue a "cluster image
resume-update" or resume from the GUI somewhere?

On Thu, Jul 9, 2020 at 3:37 PM Scott Eno <cse@hey.com> wrote:
>
> Really like the automated myself. So much better than the old 7-mode days.
>
> Only issue I repeatedly hit is on giveback, aggr giveback will get vetoed due to CIFS sessions. Never understood why it's fine to break CIFS sessions on takeover, but everything comes to a halt on giveback.
>
> Have to go to CLI and force aggr giveback with override-veto switch.
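
The CLI step described above might look roughly like the following sketch. It is an illustration under assumptions, not a procedure from the thread: the node name is hypothetical, and the `storage failover` commands and the `-override-vetoes` parameter are written from memory of the ONTAP 9 CLI, so check them against the documentation for your release. Overriding a veto can disrupt the CIFS sessions that caused it. The code only builds command strings.

```python
def giveback_commands(node: str) -> list[str]:
    """Return CLI command strings for inspecting and forcing a vetoed giveback.

    `node` is the node that was taken over and is waiting to get its
    aggregates back.
    """
    return [
        # Check giveback status and see whether a subsystem vetoed it.
        f"storage failover show-giveback -node {node}",
        # Force the giveback despite the veto.
        f"storage failover giveback -ofnode {node} -override-vetoes true",
        # Watch whether the automated update picks up where it left off.
        "cluster image show-update-progress",
    ]


if __name__ == "__main__":
    for command in giveback_commands("cluster1-01"):  # hypothetical node name
        print(command)
```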
Re: Automated vs Manual NDU
It depends.
If you catch it early enough, the automated process seems to check
progress continuously and pauses until the giveback completes.
I missed it one time for an hour (went to lunch). It took between 3 and 10
minutes, but it detected the giveback and continued automagically.

These days, I try to kick it off from the GUI. It is just too easy
not to. Not like the old days, with hours' worth of manual checks and
an upgrade process that took 1-2 hours depending on node count.

I have not seen it, but I think there is a check in the GUI to continue if
something odd happens. I cannot attest to what "odd" may be, as I have not
personally seen anything.

--tmac

*Tim McCarthy, **Principal Consultant*

*Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>*

*I Blog at TMACsRack <https://tmacsrack.wordpress.com/>*


On Thu, Jul 9, 2020 at 5:20 PM Philbert Rupkins <philbertrupkins@gmail.com>
wrote:

> Thanks for the info. I'm familiar with the vetoed giveback due to
> CIFS - we hit that during unplanned failover events. Good to know I
> can expect that during upgrades as well.
>
> Are you initiating the upgrade from the GUI? Also, when you override
> the CIFS veto, do you then need to issue a "cluster image
> resume-update" or resume from the GUI somewhere?
Re: Automated vs Manual NDU
>>>>> "Philbert" == Philbert Rupkins <philbertrupkins@gmail.com> writes:

Philbert> Thanks for the info. I'm familiar with the vetoed giveback
Philbert> due to CIFS - we hit that during unplanned failover events.
Philbert> Good to know I can expect that during upgrades as well.

I did an upgrade (see my questions from Jan/Feb time) of 8.3 to 9.3
going through 9.1 and it was smooth sailing from the CLI. Super nice
and easy. I really liked how well the upgrade process works now as
compared to the old 8.1 -> 8.3 cDOT upgrade I did, as well as other
7-mode upgrades in the past.

I'm a CLI guy (heh, nearly wrote gui there) so I just do it from a
screen session inside xterm and keep a lot of history. We did a big
ESX hardware upgrade at the same time, so all my main production loads
were shut down, but honestly, ONTAP is so rock solid for regular NFS
and even CIFS loads that I'd be ballsy and just go for it.

It all depends on your management's comfort level. Can you show them
that failovers are pretty much transparent today to give them
confidence?

Which brings me to my big rant, which is failure testing. Too many
sites/people are scared to do testing, or make any changes. If you
have a robust system, which you expect to be HA, then you need to
*test* it to be sure, and to make sure you know the right procedures
in case of problems.

Otherwise, you don't know and can't trust your setup. Which is why I
really love the Netflix Simian Army stuff. I just wish I could get
more of the team I work with to understand this idea. Test for
failures under realistic conditions or you won't know.


Re: Automated vs Manual NDU
"It all depends on your management's comfort level. Can you show them
that failovers are pretty much transparent today to give them
confidence?"

I can, and the few unexpected failovers we've experienced have been
smooth. So that is a confidence boost. However, these particular
systems host some of the most critical workloads for the business so
there is always hesitation when it's time to upgrade or make a
significant change.

Regarding your rant (as you put it), we should absolutely get into the
habit of failure testing. I can give you 7,000 excuses as to why we
don't but it essentially boils down to priorities. I need to start
pushing this idea again.

Also, massive thanks for pointing me to your previous upgrade-related
questions. The guidance to upgrade disk/shelf/sp/bmc ahead of the
upgrade reinforces current plans.

Speaking of which, is there a copy of the SP and BMC Firmware / ONTAP
Support Matrix that still lists ONTAP 9.2? We're upgrading from 9.2
to 9.3 and I'd like to upgrade the cluster BMC and SP firmware to the
versions bundled with ONTAP 9.3 ahead of time. However, the matrix
does not contain compatibility info for 9.2 so I'm left to guess.
Link below for reference.

https://mysupport.netapp.com/NOW/download/tools/serviceimage/support/ServiceProcessorSupportMatrix.shtml

ONTAP 9.2 is on limited support until the end of the month so it was a
bit surprising to see it missing from the firmware matrix.

On Fri, Jul 10, 2020 at 8:05 PM John Stoffel <john@stoffel.org> wrote:
> I did an upgrade (see my questions from Jan/Feb time) of 8.3 to 9.3
> going through 9.1 and it was smooth sailing from the CLI. Super nice
> and easy. I really liked how well the upgrade process works now as
> compared to the old 8.1 -> 8.3 cDOT upgrade I did, as well as other
> 7-mode upgrades in the past.
Re: Automated vs Manual NDU
Let the SP/BMC happen organically!
Meaning: ONTAP includes the appropriate release for the version.
You cannot "upgrade" the SP/BMC to a newer version if ONTAP does not
support it. ONTAP will not let you. You may be able to "force" it, but why
bother?
It will be upgraded to the (usually) most compatible release after the
upgrade.
Most of the time, I see this:
Node 1's ONTAP upgrade finishes and Node 2 starts working on its upgrade.
While that happens, the SP on Node 1 reboots due to its automatic upgrade.
After Node 2 boots the new ONTAP, it will eventually upgrade its SP/BMC
firmware also.

The best advice anyone can give here: REBOOT YOUR SP/BMC before any
upgrade. It makes the FW upgrade A LOT easier!
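
As a rough sketch of that pre-upgrade step, the snippet below builds one SP/BMC reboot command per node plus a final status check. The node names are hypothetical placeholders for a 3 HA pair cluster, and the `system service-processor` commands are written from memory of the ONTAP 9 CLI, so confirm them for your release and platform before relying on them. Nothing here connects to a cluster.

```python
def sp_reboot_commands(nodes: list[str]) -> list[str]:
    """Build CLI command strings to reboot the SP/BMC on each node, then verify."""
    commands = []
    for node in nodes:
        # Reboots only the service processor, not the node itself.
        commands.append(f"system service-processor reboot-sp -node {node}")
    # One cluster-wide status check after all SPs have been rebooted.
    commands.append("system service-processor show")
    return commands


if __name__ == "__main__":
    # Hypothetical node names for three HA pairs.
    example_nodes = [f"cluster1-0{i}" for i in range(1, 7)]
    for command in sp_reboot_commands(example_nodes):
        print(command)
```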

Also, from what I can tell, if you upgrade to ONTAP 9.3P18 (or newer as the
case may be) you should get the most recent BMC/SP version automatically.

Disks/Shelves: Install the files AFTER the upgrade. There are a few edge
cases these days (like an IOM12 firmware) where it will not upgrade until
ONTAP has a certain patch installed. Plus, the upgrade should include the
latest Disk/Shelf firmware files available when the ONTAP release was built.

The one item that is not in any ONTAP release is the Disk Qualification
package. You can install/update that any time. Just be sure to read any
warnings on the download page for it!

--tmac

*Tim McCarthy, **Principal Consultant*

*Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>*

*I Blog at TMACsRack <https://tmacsrack.wordpress.com/>*



On Sat, Jul 11, 2020 at 11:54 PM Philbert Rupkins <philbertrupkins@gmail.com>
wrote:

> Speaking of which, is there a copy of the SP and BMC Firmware / ONTAP
> Support Matrix that still lists ONTAP 9.2? We're upgrading from 9.2
> to 9.3 and I'd like to upgrade the cluster BMC and SP firmware to the
> versions bundled with ONTAP 9.3 ahead of time. However, the matrix
> does not contain compatibility info for 9.2 so I'm left to guess.
> Link below for reference.
Re: Automated vs Manual NDU
Aside from the possibility that Disk/Shelf FW won't auto-update
until ONTAP is at a specific patch level, is there any harm in updating
the Disk/Shelf FW beforehand? I like the idea of getting as much
done ahead of the ONTAP upgrade to reduce moving parts and length of
the maintenance window.



On Sun, Jul 12, 2020 at 4:33 PM tmac <tmacmd@gmail.com> wrote:
> Disks/Shelves: Install the files AFTER the upgrade. There are a few edge cases these days (like an IOM12 firmware) where it will not upgrade until ONTAP has a certain patch installed. Plus, the upgrade should include the latest Disk/Shelf firmware files when the ONTAP release was built.