Mailing List Archive

Odd ctrl-s problem.
Recently we restarted conserver: we killed the parent, waited for the
children to exit, and then started conserver again.

Somewhere between shutting conserver down and starting it up again, a
significant number of our machines had "ctrl s" sent to the console,
blocking all future /dev/console output. This in turn caused some apps
that were writing log messages to the console to block and stop
working.

The only common factor we have noticed so far is that the affected
machines are all ones we reach over an ssh connection.
(We use exec "ssh -c 3des user:port@hostname", followed by an initcmd
which echoes the password to the console).

Has anyone seen this issue before? Obviously, I have no idea if it
was conserver, ssh, or the terminal server that caused this - but it
happened at the time conserver was restarted.

As for possible workarounds, does anyone see an issue with sending a ^Q
within the "idlestring"?

Many Thanks
Pete
_______________________________________________
users mailing list
users@conserver.com
https://www.conserver.com/mailman/listinfo/users
Re: Odd ctrl-s problem. [ In reply to ]
On 6-Feb-08, at 6:00 AM, Peter Saunders wrote:

> Recently we restarted conserver: we killed the parent, waited for the
> children to exit, and then started conserver again.
>
> Somewhere between shutting conserver down and starting it up again, a
> significant number of our machines had "ctrl s" sent to the console,
> blocking all future /dev/console output. This in turn caused some apps
> that were writing log messages to the console to block and stop
> working.
>
> The only common factor we have noticed so far is that the affected
> machines are all ones we reach over an ssh connection.
> (We use exec "ssh -c 3des user:port@hostname", followed by an initcmd
> which echoes the password to the console).
>
> Has anyone seen this issue before? Obviously, I have no idea if it
> was conserver, ssh, or the terminal server that caused this - but it
> happened at the time conserver was restarted.

The XOFF issue is an oldie for sure. Back when I had DECwriter IIIs
on serial consoles I had the habit of always hitting <CTRL-Q> every
morning just in case they had been overflowed and stopped overnight.

The worst is when the kernel obeys XON/XOFF and then gets hung up
entirely, stopping the whole system. This was the case on SunOS, at
least up to 5.9.

It's 99.999% certain that it was the terminal server that caused it,
since it may have been configured to try to pause the output from the
attached device once its input buffer filled, and if its flow control
method is set to XON/XOFF then it would use ^s to pause the output
just as you've observed.
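
For reference, the two characters involved in XON/XOFF flow control are
single control bytes. A minimal shell sketch (the function names xoff,
xon and hex are made up for illustration) that emits them and shows
their values:

```shell
#!/bin/sh
# XOFF is ^S (ASCII DC3, 0x13, octal 023): "stop sending".
# XON  is ^Q (ASCII DC1, 0x11, octal 021): "resume sending".
xoff() { printf '\023'; }
xon()  { printf '\021'; }

# Print the bytes read from stdin as hex, without od's padding.
hex() { od -An -tx1 | tr -d ' \n'; echo; }

xoff | hex   # prints 13
xon  | hex   # prints 11
```

Redirecting the output of xon to a stalled line should have the same
effect as typing ^Q on it, assuming the far end honours XON/XOFF.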

On those old Xyplex MaxServers (which is what I'm running at home
now), and perhaps on the DECservers too since they seem to run code
derived from the same origin, there are several ways of dealing with
device output when there's no connection open to send it down. One
is to increase the typeahead size to a ridiculous amount (assuming
you have enough RAM installed in the MaxServer). That's what I've done:



Port 2:  (Remote)                                      07 Feb 2008 11:47:54

Resolve Service:          Telnet     DTR wait:                 Disabled
Idle Timeout:             0          Typeahead Size:           2048
SLIP Address:             0.0.0.0    SLIP Mask:                255.255.255.255
Remote SLIP Addr:         0.0.0.0    Default Session Mode:     Interactive
TCP Window Size:          256        Prompt:                   Xyplex
DCD Timeout:              2000       Dialback Timeout:         20
Stop Bits:                1          Script Login:             Disabled
TCP Keepalive Timer:      0          Username Filtering:       None
Nested Menu:              Disabled   Nested Menu Top Level:    0
Command Size:             80         Clear Security Entries:   Disabled
Rlogin Transparent Mode:  Disabled   Login Duration:           0
Xon Send Timer:           0          TCP Outbound Address:     0.0.0.0
Slip Autosend:            Disabled   Radius Accounting:        Disabled

Username Prompt:  Enter username>
Password Prompt:  Enter user password>


> As for possible workarounds, does anyone see an issue with sending
> a ^Q within the "idlestring"?

That's a better idea than anything else I had thought of so far! :-)

(All I had thought of was sending a ^Q with "chat" through initcmd,
but of course that only fixes the problem on startup, not if it
occurs regularly during normal use.)

--
Greg A. Woods
<woods@weird.com>

RE: Odd ctrl-s problem. [ In reply to ]
Well, it's been a long while since I've had to mess with older terminal
servers, but I seem to recall that some flavors of DECservers, for
example, allowed you to control the pass-through of various control
characters to the port. Specifically, you could control whether things
like BREAK, control-x, control-s, etc. got passed through or trapped.

I think. Might be worth looking into?

---------------------------------------------------------
Kent C. Brodie - brodie@mcw.edu
Department of Physiology
Medical College of Wisconsin
(414) 456-8590
Re: Odd ctrl-s problem. [ In reply to ]
On 6-Feb-08, at 11:31 AM, Brodie, Kent wrote:

> Well, it's been a long while since I've had to mess with older
> terminal
> servers, but I seem to recall that some flavors of decservers for
> example allowed you to control the pass-through of various control
> characters to the port. Specifically, you could control whether
> things
> like BREAK, control-x, control-s, etc got passed through or trapped.

The problem here is definitely not with pass-through of control
characters.

It's more to do with the terminal server generating control characters
in its own attempt to control the data flowing from the attached device.

Disabling flow control on the port would probably prevent the problem
from happening, but then flow control during normal use would be
impossible (pass-through of flow control characters would not
have the desired effect -- they don't get passed through SSH and
TELNET connections in the way you would want them to be transmitted
since buffering in the various network layers would defeat any
attempt to use a raw connection).

Proper flow control for interactive use requires that the terminal
server perform flow control directly itself (and that the various
network layers use whatever mechanisms they have to do flow control
properly, right down to the connection to the attached device).

E.g. you want output to stop almost immediately when you hit ^S but
you don't want anything to be lost. That means the final output
device in front of the user (e.g. xterm) must interpret the ^S from the
user and immediately stop generating output, while at the same time
pushing the flow control request back through the various layers
(CONSERVER -> SSH -> TELNET -> RS232 or whatever) so that eventually
a flow control request reaches the device generating the data in the
appropriate form, and all buffered data is preserved in all the
various layers in anticipation of the user hitting ^Q to see some
more (or is all flushed if the user hits ^C or whatever).
Note that this may sometimes involve translating the flow control
request into a hardware signal change on the RS232 line, such as
de-asserting CTS.

Note that flow control may have to work properly through all the
layers for more than just interactive uses too. If you don't want
data from your attached devices to be lost by conserver in its logs,
for example, then you need fully working flow control back through
all the layers to the attached devices. If you don't have fully
working flow control through all layers then something like a minor
network glitch may cause a buffer to fill and all data between that
time and the draining of the buffer to be lost forever.

--
Greg A. Woods
<woods@weird.com>



Re: Odd ctrl-s problem. [ In reply to ]
Many thanks for your replies,

I still haven't got to the bottom of what actually caused it, but it
seems odd if it was the terminal server. Conserver only went live on
these machines fairly recently - in the past, the terminal
servers didn't have a connection to them from the network, so all the
serial traffic was silently thrown away. This is what I was expecting to
happen during the conserver restart window. (People used to ssh to the
terminal server only when they needed the console.)

However, I think I'll change the idlestring to contain ^Q - so in
the event of a restart causing it again, at least after 5 minutes of
inactivity conserver would send a ctrl-q to it again. (Assuming I can
get it to do this?)

Cheers
Pete

Re: Odd ctrl-s problem. [ In reply to ]
you can just use:

idlestring "^Q";

(a literal caret and Q - two characters)
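
In context, a conserver.cf fragment for this might look like the sketch
below. The block name "ssh-consoles" is made up, and the idletimeout
keyword with its "5m" form is written from memory of conserver.cf(5)
rather than taken from this thread, so check it against the man page
for your version:

```
# Sketch only: send ctrl-Q (XON) after 5 minutes without activity.
default ssh-consoles {
    idletimeout 5m;
    idlestring "^Q";
}
```

Consoles would then pick this up with "include ssh-consoles;" in their
own blocks.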

conserver should only be sending characters it's told to send (the only
exception i can think of is when it's doing telnet option negotiation).
the ssh command is being run within a pseudo-tty and those layers
*might* be doing something. for example, there could be things buried
in shell startup scripts (since conserver cranks off a /bin/sh to
actually run the command) or some unexpected stty setting on the
pseudo-tty.

i'm lacking on any concrete ideas, however. well, aside from using
truss/strace and seeing if there are ctrl-s characters flying around.
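
As a rough sketch of that last idea: once the raw bytes written toward
the console have been captured to a file by whatever means is available
(the file name capture.bin and the helper count_byte are hypothetical),
the flow-control characters in it can be counted with standard tools:

```shell
#!/bin/sh
# Count occurrences of one byte in a raw capture file.
# tr -cd deletes every byte except the one listed, so wc -c counts matches.
count_byte() {
    # $1 = octal escape understood by tr (e.g. '\023'), $2 = file to scan
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

capture=${1:-capture.bin}   # hypothetical capture file name
if [ -r "$capture" ]; then
    echo "XOFF (^S, 0x13): $(count_byte '\023' "$capture")"
    echo "XON  (^Q, 0x11): $(count_byte '\021' "$capture")"
fi
```

A non-zero XOFF count would at least show that a ^S really was written
by something on the conserver side.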

Bryan

Re: Odd ctrl-s problem. [ In reply to ]
A follow up..

I did some testing after this problem occurred again.

In one window:
while true; do pkill conserver; conserver start; sleep 7; done
(sometimes, for testing, I made it a pkill -SEGV to simulate
conserver crashing)

In another window:
while true; do console <hostname>; done

On the host itself:
while true; do date > /dev/console; sleep 1; done

After about 20-30 conserver restarts, the end host required a ctrl-q to
send output again.

So, my next test: I created a file which contained ^Eco to force a
console restart.

while true; do
    cat /tmp/eco | console <hostname>
    sleep 5
done

Again, after about 20-30 restarts, the output was stopped by flow
control again.

For my last test, I took conserver out of the equation, and wrote an
expect script that connected to the console directly:

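#!/opt/sfw/bin/expect -f
# Open the same ssh connection to the terminal server that conserver uses.
spawn /usr/bin/ssh -o StrictHostKeyChecking=no -o ForwardX11=no -c 3des user:port@terminalserver
set timeout 60
# Wait for the password prompt and answer it.
expect -nocase -re "password:"
send -- "PASSWORD\n"

# Wait for three newlines of console output, then disconnect.
expect -nocase -re "\n"
expect -nocase -re "\n"
expect -nocase -re "\n"

exit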

And ran this in a while true loop (so doing exactly what conserver
would do: spawning an ssh connection and reading from it). If it was a
terminal server bug, I would expect this to behave the same as
conserver, i.e. to need a ctrl-q somewhere around the 20th-30th run.
However, this ran every time, never having an issue. It just kept
working forever.

So, it does look like it is something in conserver causing this to
happen, unfortunately intermittently.

I tried setting the default "options" to:

options "!ixon,!ixoff,autoreinit,reinitoncc";

and even

options "ixon,ixoff,autoreinit,reinitoncc";

However, this made no difference.
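
One way to check whether such options actually reach the pseudo-tty is
to look at what stty reports for it while the console is up. A small
sketch, assuming the output of "stty -a" for that pseudo-tty can be
obtained (the helper name flag_state is made up; a leading "-" in stty
output means the flag is off):

```shell
#!/bin/sh
# Report the state of one flag in `stty -a` style output read from stdin.
flag_state() {
    # $1 = flag name, e.g. ixon
    # Put every token on its own line, then look for "flag" or "-flag".
    tr ' ;\t' '\n\n\n' | awk -v f="$1" '
        $0 == f         { print "on";  found = 1 }
        $0 == ("-" f)   { print "off"; found = 1 }
        END             { if (!found) print "unknown" }'
}

# Demonstration on a made-up line of stty output.
sample='speed 9600 baud; -ignbrk ixon -ixoff'
printf '%s\n' "$sample" | flag_state ixon    # prints "on"
printf '%s\n' "$sample" | flag_state ixoff   # prints "off"
```

In real use the input would come from stty itself, e.g. with stdin
redirected from the pseudo-tty device in question.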

Getting a ^Q sent at startup (in the initcmd) does stop this happening,
as does sending it in the idlestring. So I have a workaround for my
specific issue, but it would be interesting to know why this happens at
all. Any thoughts?

Thanks
Pete


--
Pete
"Money doesn't make you happy, but money can
buy gizmos, and gizmos make you happy"