802.3x "flow-control" on NI40G/MG8 ?
Hi,

I'm having weird issues where the Foundry documentation is - uhm
somewhat - lacking. So, perhaps someone here has run into a similar
issue or has some pointers for me in order to wrap my brain around it.

According to the Foundry NI40G docs (well, the "if, then, else" game
in that large PDF), by default "flow-control" (802.3x Ethernet PAUSE
frames) is on. But in order for those PAUSE frames to be sent, you
need to set a threshold (% buffers used) for when it will pause, a
threshold for when packets will get dropped, and the scheme according
to which they should be dropped. This should all be done with
"qd-flow"...
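
For reference, a minimal Python sketch of what such a PAUSE frame looks
like on the wire, in case someone wants to pick them out of a raw
capture. The field values (destination 01:80:C2:00:00:01, EtherType
0x8808, opcode 0x0001, 16-bit pause time in quanta) are the standard
802.3x ones; the function names are made up for illustration and have
nothing to do with Foundry or Nortel tooling.

```python
import struct

# Standard 802.3x MAC Control values (not vendor specific).
PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a PAUSE frame (without FCS). Hypothetical helper name."""
    if len(src_mac) != 6:
        raise ValueError("source MAC must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("pause time must fit in 16 bits")
    header = PAUSE_DST + src_mac + struct.pack(
        "!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, quanta
    )
    # Pad with zeros to the 60-byte minimum frame size (FCS excluded).
    return header.ljust(60, b"\x00")


def parse_pause_frame(frame: bytes):
    """Return the pause time in quanta, or None if not a PAUSE frame."""
    if len(frame) < 18:
        return None
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta
```

A pause time of 0 tells the peer it may resume sending immediately, so
both the "stop" and the "go" signal show up as the same frame type.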

In that same PDF (or the other guides I tried) I cannot find any
mention of a default setting for "qd-flow", and I would therefore be
inclined to believe that without a default, these would not be sent,
or at least not until the chassis or linecard buffers are completely
depleted.

Now, the fun part is that nowhere within "sh int" or "sh sta" is
there any mention of how many of those "flow-control" / 802.3x /
PAUSE frames were sent or received.

So, does anyone have any pointers as to whether, and where, these can
be found? I'm reluctant to consider IronView (I don't like GUIs, and
it doesn't fit our ICT environment for router maintenance) or to
resort to sFlow (it's still sampled, and I trust it more with L3),
since I'm of the opinion that this is such a fundamental L2 issue
that it should show up somewhere in the CLI.

Now, because you're wondering why:

The situation in which I need that information is as follows. My
NI40G is connected via 10GE to a Nortel ERS8600R. And we've had - at
that time reproducible - issues where at seemingly random times
(usually within 30 minutes at most, but sometimes after as little as
2) the link would stall on L2. On both ends the link stays up (or at
least doesn't trigger an SNMP trap or SYSLOG message), but traffic
doesn't flow (counters don't increase etc.). BGP times out, other L2
(separate VLAN) traffic stops for >30s...

The only thing we were able to see was at the Nortel end, where their
Java GUI showed a bunch of "flow-control" packets. And when during
those issues we disabled "flow-control" on the NI40G 10GE interface,
the problems ceased (for several hours), and when we enabled it
again, they started again shortly afterwards.
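
For a sense of scale, a rough Python sketch of how long a single PAUSE
frame can hold off the sender, assuming plain 802.3x semantics (one
pause quantum is 512 bit times, and the pause time is a 16-bit count
of quanta). The function names and the 30 s figure plugged in are only
for illustration; this says nothing about what either box actually
does.

```python
import math

QUANTUM_BITS = 512       # one pause quantum, in bit times (802.3x)
MAX_QUANTA = 0xFFFF      # largest value of the 16-bit pause_time field


def pause_seconds(quanta: int, link_bps: float) -> float:
    """Duration of one pause request on a link of the given speed."""
    return quanta * QUANTUM_BITS / link_bps


def min_frames_for_stall(stall_seconds: float, link_bps: float) -> int:
    """Fewest maximum-length PAUSE frames needed to cover a stall."""
    return math.ceil(stall_seconds / pause_seconds(MAX_QUANTA, link_bps))


if __name__ == "__main__":
    ten_ge = 10e9  # 10GE link speed in bit/s
    longest = pause_seconds(MAX_QUANTA, ten_ge)
    print(f"longest single pause on 10GE: {longest * 1e3:.3f} ms")
    print("frames needed to cover 30 s:", min_frames_for_stall(30, ten_ge))
```

On 10GE a single PAUSE frame covers only a few milliseconds, so if
flow control alone were holding the link silent for more than 30
seconds, one end would have to keep refreshing the pause with
thousands of frames in a row.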

I've contacted my supplier on this issue, but apart from, well, the
most basic "did you do ..." replies from them (and from Foundry
support, I would gather), they haven't even been able to tell me what
the behaviour of the NI40G should be: in what cases will it send
PAUSE frames, in what cases will it honour received PAUSE frames,
etc.

So, any ideas or pointers are very much appreciated.

Kind regards,
JP Velders