Mailing List Archive

Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4
Hello,

I upgraded our two-node cluster from 9.3P7 to 9.4P2 because we got a new AFF system that requires version 9.4.

Since the upgrade, I can no longer revert one cluster LIF on each of the two FAS nodes. It is the same cluster LIF (clus1) on both nodes!
Here are our cluster LIFs (already including the new AFF system):
network interface show -vserver Cluster
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
            aff-01_clus1
                       up/up      169.254.45.231/16  aff-01        e0a     true
            aff-01_clus2
                       up/up      169.254.43.102/16  aff-01        e0b     true
            aff-02_clus1
                       up/up      169.254.173.54/16  aff-02        e0a     true
            aff-02_clus2
                       up/up      169.254.163.234/16 aff-02        e0b     true
            fas-01_clus1
                       up/up      169.254.4.144/16   fas-01        e0c     false
            fas-01_clus2
                       up/up      169.254.130.246/16 fas-01        e0c     true
            fas-01_clus3
                       up/up      169.254.131.229/16 fas-01        e0b     true
            fas-01_clus4
                       up/up      169.254.168.120/16 fas-01        e0d     true
            fas-02_clus1
                       up/up      169.254.89.228/16  fas-02        e0b     false
            fas-02_clus2
                       up/up      169.254.106.140/16 fas-02        e0c     true
            fas-02_clus3
                       up/up      169.254.26.197/16  fas-02        e0b     true
            fas-02_clus4
                       up/up      169.254.4.232/16   fas-02        e0d     true

As you can see, the LIFs fas-01_clus1 and fas-02_clus1 are not on their home ports.
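
With more LIFs than this, the same check can be scripted. The following is only a minimal sketch: it assumes the pasted layout of "network interface show" (LIF name on one line, status line with node, port and the Is Home flag on the next), and the function name lifs_not_home is made up for illustration. Only the four relevant rows from the table are repeated as sample input.

```python
# Sketch: list LIFs that are not on their home port, given pasted text from
# "network interface show -vserver Cluster".
# Assumption: each LIF name is on its own line, followed by a status line
# with exactly five fields: admin/oper status, address, node, port, is-home.

SAMPLE = """\
fas-01_clus1
up/up 169.254.4.144/16 fas-01 e0c false
fas-01_clus2
up/up 169.254.130.246/16 fas-01 e0c true
fas-02_clus1
up/up 169.254.89.228/16 fas-02 e0b false
fas-02_clus2
up/up 169.254.106.140/16 fas-02 e0c true
"""


def lifs_not_home(text):
    """Return (lif, node, current_port) for every LIF with Is Home = false."""
    result = []
    lif = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 1:
            # A single word is a LIF name (or the vserver name, which is
            # overwritten by the next LIF name before it is ever used).
            lif = fields[0]
        elif len(fields) == 5 and fields[0].count("/") == 1 and lif:
            _status, _address, node, port, is_home = fields
            if is_home == "false":
                result.append((lif, node, port))
            lif = None
    return result


for lif, node, port in lifs_not_home(SAMPLE):
    print(f"{lif} is currently on {node}:{port}, not on its home port")
```
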
When I try to revert them or migrate them to their home port, I get the same error each time. Migrating the LIF to a different port works without any problems:
network interface revert -vserver Cluster -lif fas-01_clus1
Error: command failed: LIF "fas-01_clus1" failed to migrate: failed to move cluster/node-mgmt LIF.

network interface migrate -vserver Cluster -lif fas-01_clus1 -destination-node fas-01 -destination-port e0a
Error: command failed: LIF "fas-01_clus1" failed to migrate: failed to move cluster/node-mgmt LIF.

I checked the physical port e0a and found no issues, and it was working fine before the upgrade. I even did a cluster switch upgrade two weeks earlier and was able to revert all LIFs afterwards.
I disabled port e0a, disabled the switch port, and checked the cable. Nothing changed.

Here is the port config:
network port show -node fas-01 -port e0a -instance

Node: fas-01
Port: e0a
Link: up
MTU: 9000
Auto-Negotiation Administrative: true
Auto-Negotiation Operational: true
Duplex Mode Administrative: auto
Duplex Mode Operational: full
Speed Administrative: auto
Speed Operational: 10000
Flow Control Administrative: none
Flow Control Operational: none
MAC Address: 00:a0:98:3a:52:13
Port Type: physical
Interface Group Parent Node: -
Interface Group Parent Port: -
Distribution Function: -
Create Policy: -
Parent VLAN Node: -
Parent VLAN Port: -
VLAN Tag: -
Remote Device ID: sw1
IPspace Name: Cluster
Broadcast Domain: Cluster
MTU Administrative: 9000
Port Health Status: healthy
Ignore Port Health Status: false
Port Health Degraded Reasons: -


network port show -node fas-01 -port e0c -instance

Node: fas-01
Port: e0c
Link: up
MTU: 9000
Auto-Negotiation Administrative: true
Auto-Negotiation Operational: true
Duplex Mode Administrative: auto
Duplex Mode Operational: full
Speed Administrative: auto
Speed Operational: 10000
Flow Control Administrative: none
Flow Control Operational: none
MAC Address: 00:a0:98:3a:52:15
Port Type: physical
Interface Group Parent Node: -
Interface Group Parent Port: -
Distribution Function: -
Create Policy: -
Parent VLAN Node: -
Parent VLAN Port: -
VLAN Tag: -
Remote Device ID: sw2
IPspace Name: Cluster
Broadcast Domain: Cluster
MTU Administrative: 9000
Port Health Status: healthy
Ignore Port Health Status: false
Port Health Degraded Reasons: -

Port e0c is working without any problems.
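
To make the comparison of the two ports explicit, here is a small sketch that diffs two "-instance" listings field by field. It assumes every field is a "Name: value" line; the helper names parse_fields and diff_fields are hypothetical, and only a subset of the fields shown above is repeated to keep it short. On the full listings above, the only fields that differ are Port, MAC Address and Remote Device ID.

```python
# Sketch: compare two "network port show ... -instance" outputs field by field.
# Assumption: each field is a "Name: value" line, as in the pasted output.

E0A = """\
Port: e0a
Link: up
MTU: 9000
Speed Operational: 10000
MAC Address: 00:a0:98:3a:52:13
Remote Device ID: sw1
Port Health Status: healthy
"""

E0C = """\
Port: e0c
Link: up
MTU: 9000
Speed Operational: 10000
MAC Address: 00:a0:98:3a:52:15
Remote Device ID: sw2
Port Health Status: healthy
"""


def parse_fields(text):
    """Turn 'Name: value' lines into a dict, splitting at the first ': '."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def diff_fields(a, b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


for name, (left, right) in diff_fields(parse_fields(E0A), parse_fields(E0C)).items():
    print(f"{name}: {left} vs {right}")
```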

I have now run some more tests and found something strange with the cluster ping-cluster command.
Nodes fas-02, aff-01 and aff-02 behave the same: all of them get the correct cluster addresses from the network interface table. Only fas-01 doesn't get any addresses from it:
Example with fas-02:
cluster ping-cluster -node fas-02
Host is fas-02
Getting addresses from network interface table...
Cluster aff-01_clus1 169.254.45.231 aff-01 e0a
Cluster aff-01_clus2 169.254.43.102 aff-01 e0b
Cluster aff-02_clus1 169.254.173.54 aff-02 e0a
Cluster aff-02_clus2 169.254.163.234 aff-02 e0b
Cluster fas-01_clus1 169.254.4.144 fas-01 e0c
Cluster fas-01_clus2 169.254.130.246 fas-01 e0c
Cluster fas-01_clus3 169.254.131.229 fas-01 e0b
Cluster fas-01_clus4 169.254.168.120 fas-01 e0d
Cluster fas-02_clus1 169.254.89.228 fas-02 e0c
Cluster fas-02_clus2 169.254.106.140 fas-02 e0c
Cluster fas-02_clus3 169.254.26.197 fas-02 e0b
Cluster fas-02_clus4 169.254.4.232 fas-02 e0d
Local = 169.254.89.228 169.254.106.140 169.254.26.197 169.254.4.232
Remote = 169.254.45.231 169.254.43.102 169.254.173.54 169.254.163.234 169.254.4.144 169.254.130.246 169.254.131.229 169.254.168.120
Cluster Vserver Id = 4294967293
Ping status:
................................
Basic connectivity succeeds on 32 path(s)
Basic connectivity fails on 0 path(s)

Here is the example from fas-01:
cluster ping-cluster -node fas-01
Host is fas-01
Getting addresses from network interface table...
Getting addresses from sitelist...
Local = 169.254.4.144 169.254.130.246
Remote = 169.254.89.228 169.254.106.140 169.254.45.231 169.254.43.102 169.254.173.54 169.254.163.234
Cluster Vserver Id = 4294967293
Ping status:
............
Basic connectivity succeeds on 12 path(s)
Basic connectivity fails on 0 path(s)

As you can see, fas-01 is missing a lot of addresses and wasn't able to get any of them from the network interface table.
When I explicitly set -use-sitelist true, I see the same behavior on all nodes, even on the new ones...
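
The path counts in the two outputs are consistent with one ping per (local address, remote address) pair: 4 x 8 = 32 on fas-02 and 2 x 6 = 12 on fas-01. That relation is inferred from the numbers above, not taken from NetApp documentation. A minimal sketch that checks it and lists the addresses fas-01 does not see (the function names are hypothetical):

```python
# Sketch: check the path counts reported by "cluster ping-cluster" and find
# the cluster addresses that fas-01 is missing.
# Assumption: the number of tested paths is len(Local) * len(Remote).

FAS02_LOCAL = "169.254.89.228 169.254.106.140 169.254.26.197 169.254.4.232".split()
FAS02_REMOTE = (
    "169.254.45.231 169.254.43.102 169.254.173.54 169.254.163.234 "
    "169.254.4.144 169.254.130.246 169.254.131.229 169.254.168.120"
).split()

FAS01_LOCAL = "169.254.4.144 169.254.130.246".split()
FAS01_REMOTE = (
    "169.254.89.228 169.254.106.140 169.254.45.231 169.254.43.102 "
    "169.254.173.54 169.254.163.234"
).split()


def expected_paths(local, remote):
    """One path per (local, remote) address pair."""
    return len(local) * len(remote)


def missing_addresses(reference, seen):
    """Addresses in the reference list that do not appear in seen."""
    seen_set = set(seen)
    return [addr for addr in reference if addr not in seen_set]


print(expected_paths(FAS02_LOCAL, FAS02_REMOTE))  # 32, as reported by fas-02
print(expected_paths(FAS01_LOCAL, FAS01_REMOTE))  # 12, as reported by fas-01

# fas-02 sees all twelve cluster addresses, so use its view as the reference.
all_addresses = FAS02_LOCAL + FAS02_REMOTE
print(missing_addresses(all_addresses, FAS01_LOCAL + FAS01_REMOTE))
```

The four missing addresses are the clus3 and clus4 LIFs of fas-01 and fas-02, so the sitelist view contains only two cluster addresses per node.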

Has anyone here ever seen such an issue? My problem is that the FAS system is no longer under support, so I can't open a ticket with NetApp.
Hopefully someone can help!

Best Regards
Florian Schmid
Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4
Have you tried running the Config Advisor tool against your system to
see if it can find any misconfigurations? It's my best suggestion,
since we haven't made the leap to 9.x yet; management is too
conservative.

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4
Good morning John,

Yes, I ran the latest version of the Config Advisor tool, but it found nothing particularly wrong with the configuration.
The only finding is that the ports we use for the cluster interconnect do not follow best practice. To be more exact:
we are using e0a to e0d, but not in the recommended order regarding which port should connect to which switch.

BR Florian

----- Original Message -----
From: "John Stoffel" <john@stoffel.org>
To: "Florian Schmid" <fschmid@ubimet.com>
CC: "toasters" <toasters@teaparty.net>
Sent: Wednesday, 17 October 2018 21:10:01
Subject: Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4

Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4
Hello all,

I want to let you know that rebooting both affected nodes, one at a time, solved the issue.
All LIFs are now back on their home ports and can be migrated and reverted as often as I want.

Thank you for your help!

BR Florian


----- Original Message -----
From: "Florian Schmid" <fschmid@ubimet.com>
To: "John Stoffel" <john@stoffel.org>
CC: "toasters" <toasters@teaparty.net>
Sent: Thursday, 18 October 2018 08:29:06
Subject: Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4

Re: Cluster lifs not consistent anymore after upgrade from 9.3 to 9.4
Good to hear! You must have gotten into a weird state not caught by
NetApp's internal testing. Glad it was this easy to fix.
