Mailing List Archive

lines going "down"
Hi all. We were using conserver 7.0.0 for several months, but ports
seemed to get stuck too often, forcing me to restart the daemon, so I
upgraded to 7.1.3. Unfortunately, things have been worse under this
version. More often than not, ports suddenly go "down", and restarting
the server makes some ports come up and others go down.

Is there some better in-between version I should try? I'm starting to
get frustrated.

oracle1  down  <none>
pbxp1    down  <none>
sunray1  up    <none>
sunray2  up    <none>
cisco    up    <none>
jtsd07   up    <none>
jtsd08   up    <none>
jtsd09   up    <none>
toro     up    <none>
shasta   down  <none>
tahoe    down  <none>
vodavi   up    <none>

Right now, the four shown "down" should be "up". Five minutes ago,
jtsd0[789] showed "down" for no apparent reason. 'netstat' shows no
active connection between the conserver and the terminal server for the
"down" ports. ^Eco just comes back with "line is still down".

We are using a Portmaster PM2e. Any help or suggestions would be
appreciated. Thank you.
RE: lines going "down"
We saw a problem this week where the lines would not come up no matter
what. After an hour of coaxing (stopping the daemon and restarting it,
stopping and restarting...) they finally came up. Never went through that
before, and they seem stable since then. Have to keep an eye on this...

Ernie


Re: lines going "down"
I've seen a similar scenario in one particular lab, using a
Cisco 3640 with NM-32A cards. I don't think this is a brand
issue. I merely offer it as another clue....

When we see the failure, we typically see 8 ports in a group
go down...all 8 in a modulo-8 group (e.g. 1-8, 17-24).
All of the affected lines are run by the same OCTART chip.

While I could point to a failure in IOS for this (which
would only be circumstantial and unsupported by fact), I
actually have another working theory, based on looking at
the devices attached...

In these cases, there was usually a network interruption
between the conserver and the console server. This could be
a switch/router failure in the network, or a forced reboot
of the conserver host without a polite shutdown...and the
devices showing 'down' were what I call 'quiet hosts'. (A
quiet host is a device that only replies when you talk to
it...it doesn't usually offer any log traffic, time stamps,
etc. to the logs unless someone is typing to it.)

In the case of a network break like this, the TCP sessions
to all of the ports (from Conserver to the Console Server)
don't get cleared out when the connectivity failure occurs!
Since the host doesn't generate any traffic on the serial
port, the console server never tries to send traffic to the
conserver host, and the console server leaves the session
open, thinking that the conserver host is just idle. The
root cause here is that the TCP FIN sequence never occurred.
So, when you restart your Conserver, and it tries to
connect to these ports on the console server, the console
server tells the conserver that the TCP port is busy (since
the console server thinks the old session is still
there and idle...)
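
To illustrate the general mechanism that guards against this kind of
half-open session, here is a minimal Python sketch of turning on TCP
keepalive for a socket. With keepalive enabled, the endpoint sends
periodic probes on an idle connection and drops it if the peer stops
answering. The helper name enable_keepalive and the timing values are
my own assumptions for illustration. This is not how conserver or any
particular console server is implemented, and whether your terminal
server offers an equivalent setting depends on the device.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keepalive on an existing socket (hypothetical helper).

    idle, interval and count are illustrative values, not recommendations.
    """
    # Ask the kernel to probe the peer when the connection sits idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs below are platform-specific; apply only those
    # that this platform's socket module actually exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

Note that keepalive only helps the side that has it enabled. In the
scenario above it is the console server that holds the stale session,
so the probing would have to happen there.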

In these cases, our cure has been to log into the console
server, and reset each affected line, one by one. This will
blow away the (already broken) TCP session, and allow you to
either restart your conserver, or just force open each of
the lines that were down.
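
Before resetting lines by hand, it can help to find out which TCP
ports on the console server still accept a connection. The following
is a rough Python sketch, assuming the console server refuses or
ignores connections to a line it believes is busy (that behavior is
device-specific, so treat it as an assumption). The function name
probe_port is hypothetical. Keep in mind that a successful probe
briefly occupies the line, so run it only while conserver is stopped.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try one TCP connect and classify the outcome (hypothetical helper).

    Returns "open", "refused", "timeout" or "error".
    """
    try:
        # The connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

For example, looping probe_port over the port numbers your lines are
mapped to (taken from your own conserver configuration) gives a quick
list of the lines that need a reset on the console server.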

While this doesn't happen too often in the data centers,
I have seen this in some of the remote locations. Maybe
that's another good argument for having a distributed
Conserver deployment, and putting a logging host 'closer'
to the console servers? :-)

Regards,

-Z- http://www.conserver.com/consoles/breakoff.html