Mailing List Archive

Re: Multicast is being switched by LP CPU on MLXe?

Sorry... I haven't had time to keep up with the thread.

I'm not as much of an expert on the CES/CER internal architecture, so my answer would be "that depends". Certain operations (like sFlow forwarding, aggressive SNMP querying, etc.) will drive up the LP CPU; I would look at the averages over 300 seconds, though. If you're sitting at consistently high utilization over a long period and the router isn't doing heavy routing work, it's worth taking a closer look.
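
If you want to watch those averages without babysitting the CLI, something like the following will pull them over SNMP. This is a rough, untested sketch: the table OID is what I remember for snAgentCpuUtilTable from FOUNDRY-SN-AGENT-MIB, so verify it against your MIB files first, and the router address and community string are placeholders.

#!/usr/bin/env python3
# Rough sketch (untested): walk the Foundry/Brocade CPU utilization table.
# Each row is indexed by (slot, cpu-id, averaging interval in seconds),
# so look at the rows whose interval index is 300.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, nextCmd)

CPU_UTIL_TABLE = '1.3.6.1.4.1.1991.1.1.2.11.1'  # snAgentCpuUtilTable (assumed)

def walk_cpu_util(host, community='public'):
    for err_ind, err_stat, _, var_binds in nextCmd(
            SnmpEngine(), CommunityData(community),
            UdpTransportTarget((host, 161)), ContextData(),
            ObjectType(ObjectIdentity(CPU_UTIL_TABLE)),
            lexicographicMode=False):
        if err_ind or err_stat:
            break                     # stop on timeout or SNMP error
        for name, value in var_binds:
            print(name.prettyPrint(), '=', value.prettyPrint())

walk_cpu_util('192.0.2.1')            # placeholder; use your router's address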

Most of the real-world troubleshooting I've done for high LP CPU has involved large multicast networks or strange cases like duplicate MAC addresses. I haven't seen enough customer environments to have a good feel for what normal LP utilization looks like on a CES in a QinQ/MEF network or on one acting as a BGP edge router for a large environment.

High LP CPU situations are always triggered when operations have to run on the CPU instead of in the LP's FPGA. There are plenty of operations that can only run on the CPU (initial FPGA flow programming, SNMP, sFlow, etc.), but issues like flapping router ports and duplicate addresses can also cause churn in the FPGA through constant reprogramming, which forces the CPU to take over some of the traffic forwarding from the FPGA.

Hope this background helps.

Wilbur


-----Original Message-----
From: Brian Rak [mailto:brak@gameservers.com]
Sent: Wednesday, March 30, 2016 1:18 PM
To: Wilbur Smith <wsmith@brocade.com>; foundry-nsp@puck.nether.net
Subject: Re: [f-nsp] Multicast is being switched by LP CPU on MLXe?

Does this apply to the CERs as well? We've been seeing high (40%+) CPU from lp on them, but we had been assuming that was normal.

On 3/30/2016 1:45 PM, Wilbur Smith wrote:
> High CPU on the LP means that something is preventing hardware programming on the LP's packet processor (FPGA). As unlikely as it sounds, when I see this on an LP it's usually because there's a duplicate multicast source IP or a duplicate MAC address.
>
> Recently, I had a customer using a Linux server as an encoder. An admin copied and pasted the server's network ifcfg files onto a new server, but didn't remove or change the MAC address in the copied config. This created multicast sources with different IPs but the same MAC address. Nothing inside the MLXe knows there's a duplication (because it's not supposed to happen), so the router constantly re-programs the Packet Processor back and forth between the two MACs. This constant churn prevents the FID entry from ever being used, so the CPU on the LP tries to handle all of the switching. But... this may not be a multicast issue. If you still see high CPU after you disable IGMP snooping, I would temporarily disable the multicast source (if you can) and see if that reduces the CPU.
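>
> To make the mistake concrete: the pasted file carried a MAC override, so both boxes ended up sourcing traffic from the same MAC. Roughly like this, assuming RHEL-style ifcfg files (interface names and addresses here are made up for illustration):
>
> # /etc/sysconfig/network-scripts/ifcfg-eth0 on the ORIGINAL encoder
> DEVICE=eth0
> BOOTPROTO=static
> ONBOOT=yes
> MACADDR=00:25:90:AB:CD:EF   # initscripts set this as the interface MAC
> IPADDR=10.10.20.5
> NETMASK=255.255.255.0
>
> # The same file pasted onto the NEW server: new IP, MACADDR never changed
> DEVICE=eth0
> BOOTPROTO=static
> ONBOOT=yes
> MACADDR=00:25:90:AB:CD:EF   # duplicate of the original server's MAC
> IPADDR=10.10.20.6
> NETMASK=255.255.255.0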
>
> When everything is working correctly, you shouldn't see any real CPU utilization on an LP over a 1-minute average. I have a customer pushing 1.2 terabits per second of multicast on an original MLX-32 with all LPs running at 1-2%, so the CPU level isn't related to the amount of traffic. What you want to keep in mind is that high LP CPU is usually a symptom of lots of churning of the FID tables on the Packet Processor, so work back from there and think about what could cause that. Do you see matching high CPU on the Management Modules? Do you see a spike in the L2 process when you run a 'show cpu proc'? Is there a possibility of an L2 loop somewhere? Do you have another process like BGP, OSPF, or STP using a lot of CPU cycles?
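>
> One quick way to check for that churn from the outside is to walk the standard BRIDGE-MIB forwarding table twice and see which MACs moved bridge ports between walks; a MAC that keeps bouncing between two ports is your loop or duplicate-address suspect. A rough, untested Python sketch (the router address is a placeholder):
>
> import time
> from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
>                           ContextData, ObjectType, ObjectIdentity, nextCmd)
>
> FDB_PORT = '1.3.6.1.2.1.17.4.3.1.2'  # dot1dTpFdbPort, standard BRIDGE-MIB
>
> def snapshot(host, community='public'):
>     # Map MAC -> bridge port; the last six sub-IDs of each row's OID
>     # index are the MAC address bytes.
>     table = {}
>     for err_ind, err_stat, _, binds in nextCmd(
>             SnmpEngine(), CommunityData(community),
>             UdpTransportTarget((host, 161)), ContextData(),
>             ObjectType(ObjectIdentity(FDB_PORT)), lexicographicMode=False):
>         if err_ind or err_stat:
>             break
>         for name, port in binds:
>             mac = ':'.join('%02x' % int(x) for x in str(name).split('.')[-6:])
>             table[mac] = int(port)
>     return table
>
> before = snapshot('192.0.2.1')       # placeholder address
> time.sleep(30)
> after = snapshot('192.0.2.1')
> for mac in before.keys() & after.keys():
>     if before[mac] != after[mac]:
>         print('%s moved: port %d -> %d' % (mac, before[mac], after[mac]))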
>
> Hope this helps as a starting point.
>
> Wilbur
>
> -----Original Message-----
> From: foundry-nsp [mailto:foundry-nsp-bounces@puck.nether.net] On
> Behalf Of Alexander Shikoff
> Sent: Tuesday, March 29, 2016 6:20 AM
> To: foundry-nsp@puck.nether.net
> Subject: [f-nsp] Multicast is being switched by LP CPU on MLXe?
>
> Hello!
>
> I have some VLANs configured between certain ports on an MLXe box (MLXe-16, 5.7.0e).
> All ports are in 'no route-only' mode. For example:
>
> telnet@lsr1-gdr.ki(config)#show vlan 720
>
> PORT-VLAN 720, Name V720, Priority Level 0, Priority Force 0, Creation Type STATIC
> Topo HW idx : 65535 Topo SW idx: 257 Topo next vlan: 0
> L2 protocols : NONE
> Statically tagged Ports : ethe 7/1 ethe 9/5 ethe 10/6 ethe 11/6 ethe 13/3
> Associated Virtual Interface Id: NONE
> ----------------------------------------------------------
> Port Type Tag-Mode Protocol State
> 7/1 TRUNK TAGGED NONE FORWARDING
> 9/5 TRUNK TAGGED NONE FORWARDING
> 10/6 TRUNK TAGGED NONE FORWARDING
> 11/6 TRUNK TAGGED NONE FORWARDING
> 13/3 PHYSICAL TAGGED NONE FORWARDING
> Arp Inspection: 0
> DHCP Snooping: 0
> IPv4 Multicast Snooping: Enabled - Passive
> IPv6 Multicast Snooping: Disabled
>
> No Virtual Interfaces configured for this vlan
>
>
>
> As you may notice, passive multicast snooping is enabled on that VLAN.
> The problem is that multicast traffic is being switched by LP CPU, causing high CPU utilization and packet loss.
>
> This can be clearly seen on the rconsole:
>
> LP-11#debug packet capture rx include src-port 11/6 vlan-id 720
> dst-address 233.191.133.96 [...]
> **********************************************************************
> [ppcr_tx_packet] ACTION: Forward packet using fid 0xa06d
> [ppcr_rx_packet]: Packet received
> Time stamp : 56 day(s) 20h 48m 05s:,
> TM Header: [ 0564 0aa3 0080 ]
> Type: Fabric Unicast(0x00000000) Size: 1380 Class: 0 Src sys port:
> 2723 Dest Port: 0 Drop Prec: 2 Ing Q Sig: 0 Out mirr dis: 0x0 Excl
> src: 0 Sys mc: 0
> **********************************************************************
> Packet size: 1374, XPP reason code: 0x000095cc
> 00: 05f0 0003 5c50 02d0-7941 fffe 8000 0000 FID = 0x05f0
> 10: 0100 5e3f 85b0 e4d3-f17d a7c5 0800 4516 Offset = 0x10
> 20: 0540 0000 4000 7e11-f89f b060 df26 e9bf VLAN = 720(0x02d0)
> 30: 85b0 04d2 04d2 052c-0000 4701 3717 8134 CAM = 0x0ffff(R)
> 40: 01d0 92de c56f 18f6-dc4f 8d00 1273 cdb3 SFLOW = 0
> 50: c3ff 3da8 2600 5a37-cfbe 993f dbfd c927 DBL TAG = 0
> 60: 8000 8290 ef9b 7638-9089 9a50 5000 8611
> 70: 2026 0079 8de2 a404-1013 dffd 04e0 1404
> Pri CPU MON SRC PType US BRD DAV SAV DPV SV ER TXA SAS Tag MVID
> 0 0 0 11/6 3 0 1 0 1 1 1 0 0 0 1 0
>
> 176.96.223.38 -> 233.191.133.176 UDP [1234 -> 1234]
> **********************************************************************
>
> As far as I understand the documentation, that should not happen.
> The situation remains the same if I disable IGMP snooping.
>
> Any ideas/suggestions are kindly appreciated!
>
> Thanks in advance.
>
> --
> MINO-RIPE
> _______________________________________________
> foundry-nsp mailing list
> foundry-nsp@puck.nether.net
> http://puck.nether.net/mailman/listinfo/foundry-nsp

_______________________________________________
foundry-nsp mailing list
foundry-nsp@puck.nether.net
http://puck.nether.net/mailman/listinfo/foundry-nsp