Mailing List Archive

RE: NICs fail to connect after upgrade to OnTap 9.3P21
Hi Brian,

I recall having the same issue a few years ago, and for me it was fixed by replacing the SFPs. Their firmware was no longer compatible with the newer ONTAP release and/or its NIC driver, so the link was always down.
Are you using official NetApp SFPs, or are you coding them yourself?

Best,

Alexander Griesser
Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: AGriesser@anexia-it.com
Web: http://www.anexia-it.com

Headquarters address, Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Commercial register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

-----Original Message-----
From: Brian Parent <bparent@ucsd.edu>
Sent: Friday, 26 March 2021 02:23
To: toasters@teaparty.net
Subject: NICs fail to connect after upgrade to OnTap 9.3P21

I upgraded our AFF8020 (2 nodes) yesterday from 9.1P10 to 9.3P21, as a first step towards 9.7P12.

Unfortunately, the two 10Gb interfaces (e0c, e0d) on the second controller won't connect after the upgrade.
It looks like a layer 1 problem: there are no link lights on either the NetApp side or the switch side.
Yet it seems unlikely that two physical paths would fail at the same time, and too coincidental that it happened (apparently) during the giveback.
Reseating the fiber jumpers and the SFP optics hasn't helped.

Has anyone else seen anything like this?

I've engaged (third party) support, but so far, no joy.

I've tried shut/no shut on the switch side (Cisco 9K), and the same sort of thing on the NetApp side (advanced mode: network port modify -node <nodename> -port <portname> -up-admin false on both ports, then back to true).
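For anyone repeating the NetApp-side bounce, here is a minimal Python sketch that only builds the command lines described above (admin down on both ports, then admin up on both). It does not connect to a cluster or run anything. The node name "node2" is a placeholder, and the helper name bounce_commands is made up for illustration; the command syntax is taken as-is from the message.

```python
def bounce_commands(node, ports):
    """Return ONTAP CLI lines for an admin down/up cycle on the given ports.

    The command form is copied from the message above; nothing is executed.
    All ports are taken down first, then all are brought back up.
    """
    commands = []
    for state in ("false", "true"):
        for port in ports:
            commands.append(
                f"network port modify -node {node} -port {port} -up-admin {state}"
            )
    return commands


if __name__ == "__main__":
    # "node2" is a placeholder for the real node name.
    for line in bounce_commands("node2", ["e0c", "e0d"]):
        print(line)
```

The output can be pasted into an advanced-mode session one line at a time, checking the link state between the down and up steps.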

Also, e0c and e0d were bundled into an interface group (a0a) on the NetApp side and a vPC port channel on the Cisco side. This is configured identically on both nodes, so the config seems fine, since it still works on node1.
I tried deleting node2's a0a, but that failed because a LIF had its home port set to that a0a. So I migrated that LIF "permanently" to the working node, deleted node2's a0a, and then tried the -up-admin false/true trick again on the e0c and e0d ports, but still no joy.

I noticed the MTU was set to 9216 on the Cisco side and 1500 on the NetApp side, but again, this is identical on both nodes, and is working on node1, so even though it looks suspicious, it's probably not the problem. I tried setting the NetApp side to 9000 (it didn't accept 9216), but that didn't help either.

I'm thinking my next step is to delete node1's a0a and remove the corresponding port channels on the Cisco side, then try swapping known good fibers and optics to the problem ports to determine which piece is the root cause.
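The swap test amounts to elimination: any part that has taken part in a working link is cleared, and whatever is never seen in a working link stays a suspect. A rough Python sketch of that bookkeeping follows. The component labels (such as "node2:e0c" and "optic-A") and the function name are made up for illustration, and the two recorded results are hypothetical, not observations from this system.

```python
def remaining_suspects(results):
    """Return the components that never appeared in a working link.

    results is a list of (components, link_up) pairs, where components is a
    set of labels for the port, optic, and fiber used in one test.
    """
    seen = set()
    cleared = set()
    for components, link_up in results:
        seen |= components
        if link_up:
            # Everything in a working path is considered known good.
            cleared |= components
    return sorted(seen - cleared)


if __name__ == "__main__":
    # Hypothetical results: the same optic and fiber fail on node2's port
    # but link up on node1's port.
    tests = [
        ({"node2:e0c", "optic-A", "fiber-A"}, False),
        ({"node1:e0c", "optic-A", "fiber-A"}, True),
    ]
    print(remaining_suspects(tests))
```

This is deliberately simple: it assumes a part that works once is good, and it ignores the switch port and any compatibility issue between a specific optic and a specific port.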

What does the toasters brain trust say to all this?

--
Brian Parent
Information Technology Services Department
ITS Computing Infrastructure Operations Group
its-ci-ops-help@ucsd.edu (team email address for Service Now)
UC San Diego
(858) 534-6090

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters