[nsp] AS53192 woe
OK, here is a mind-bender. I'm working on an AS53192 with 12.2.2XB7 code
and the latest (2.9.4.0) portware. Things are not going too well, and this
looks like a potential software problem.

Many users log in, transfer only 181 octets, and end up with a "zombie"
session: they can't transfer any data, and their IP address never shows up
in the routing table. They appear in show caller ip, but the IP isn't
pingable and there is no route for it when you do a show ip route. They
disconnect, log back in, and voila, they are online.

We were pulling our hair out trying to track this to a client modem
incompatibility or something similar, and it wasn't until we ran a
complicated RADIUS database query that we found out this phenomenon
tracks the modem/line itself, not the IP address or anything else.
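
For anyone who wants to run the same kind of check, here is a rough sketch
of grouping accounting stop records by port. It is not the query we actually
ran. It assumes each record is a dict keyed by the standard RADIUS
accounting attribute names (NAS-Port, Acct-Input-Octets,
Acct-Output-Octets), that the 181 octets is the session total, and that
anything at or below 200 octets counts as a zombie. The sample rows are
made up purely to show the shape of the data.

```python
from collections import defaultdict

# Assumption: sessions at or below this many total octets are "zombies".
# The post only says affected sessions transfer 181 octets.
ZOMBIE_OCTET_LIMIT = 200


def zombie_rate_by_port(records, limit=ZOMBIE_OCTET_LIMIT):
    """Return {nas_port: (zombie_sessions, total_sessions, zombie_fraction)}."""
    totals = defaultdict(int)
    zombies = defaultdict(int)
    for rec in records:
        port = rec["NAS-Port"]
        octets = rec["Acct-Input-Octets"] + rec["Acct-Output-Octets"]
        totals[port] += 1
        if octets <= limit:
            zombies[port] += 1
    return {p: (zombies[p], totals[p], zombies[p] / totals[p]) for p in totals}


# Hypothetical sample rows, not real accounting data.
sample = [
    {"NAS-Port": 1, "Acct-Input-Octets": 181, "Acct-Output-Octets": 0},
    {"NAS-Port": 1, "Acct-Input-Octets": 181, "Acct-Output-Octets": 0},
    {"NAS-Port": 1, "Acct-Input-Octets": 52000, "Acct-Output-Octets": 910000},
    {"NAS-Port": 2, "Acct-Input-Octets": 48000, "Acct-Output-Octets": 700000},
]

for port, (z, n, frac) in sorted(zombie_rate_by_port(sample).items()):
    print(f"port {port}: {z}/{n} zombie sessions ({frac:.0%})")
```

Sorting the result by zombie fraction instead of port number makes the bad
lines stand out once there are a couple of hundred ports in the data.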

A show modem (abbreviated) shows this:

     Avg Hold  Inc calls   Out calls   Busied  Failed  No      Succ
Mdm  Time      Succ  Fail  Succ  Fail  Out     Dial    Answer  Pct.
1/0  00:08:37   382    17     0     0       0       0       0   96%
1/1  00:30:41   201     9     0     0       0       0       0   96%

It turns out modem 1/0 (line 1) is bad, in that a high percentage of the
calls it takes (according to RADIUS) result in these zombie sessions.
According to the database, at least 45 of its calls were zombies. Yet the
show modem test log is clean for this modem. If you look above, it took
382 calls whereas its neighbor took 201. Looking through the rest of the
show modem output, any modem that took a disproportionate number of calls
relative to its neighbors also has this zombie problem.
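
The "disproportionate number of calls" check can be roughed out in a few
lines. This is only a sketch: it assumes the abbreviated show modem layout
above (modem in the first column, successful incoming calls in the third),
and the 1.25x-over-median cutoff is an arbitrary guess, not a tested
threshold. Only the two rows quoted above are used here; with the full
192-modem output the median would be a much better baseline.

```python
from statistics import median

# The two data rows quoted in this post.
SHOW_MODEM = """\
1/0 00:08:37 382 17 0 0 0 0 0 96%
1/1 00:30:41 201 9 0 0 0 0 0 96%
"""


def incoming_calls(text):
    """Map modem id (e.g. '1/0') to its successful incoming call count."""
    calls = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip header rows and anything that is not a modem data row.
        if len(fields) < 3 or "/" not in fields[0] or not fields[2].isdigit():
            continue
        calls[fields[0]] = int(fields[2])
    return calls


def suspects(calls, factor=1.25):
    """Return modems whose call count exceeds factor times the median."""
    typical = median(calls.values())
    return sorted(m for m, n in calls.items() if n > factor * typical)


print(suspects(incoming_calls(SHOW_MODEM)))
```

The flagged list could then be compared against the per-port zombie counts
from RADIUS to see whether the two really line up.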

In all, there are 22 modems scattered across the 192 modems in the box that
have this problem. For now, I did something like a line 1 modem busyout to
kill the modem entirely.

But what could be causing this? These modems aren't going into the classic
B state to indicate they are bad, so modem recovery never kicks in to try
to revive them with a firmware download. Also, the show controller T1
output shows zero errors (which makes sense because the box is colocated
with the telco; there are no miles of underground copper to contend with,
it's all fiber to a mux to the 53192).

I need some ideas on things to look at.

Thanks,

Chris