We are using Conserver 7.2.7 to serve about 900 console lines on 25
Cyclades TS terminal concentrators, from a Sun Ultra 5 running Solaris
8. We had no problems at levels of 500-600 lines, but the recent
expansion to 900 lines appears to have led to the following interesting
behavior:
After the server has been up for ten days or so, a few users begin
experiencing timeouts when connecting to a small number of console
lines, viz:
--------------------------------------------------------
myhost$ console beta-15-1
< --- Three minutes of silence --- >
console: connect: 61897@conserver: Connection timed out
--------------------------------------------------------
Logging into the Conserver server, I notice a number of connections to
port 61897 in CLOSE_WAIT state. These entries tend to hang around for a
LONG time (e.g. days):
--------------------------------------------------------------------
conserver# netstat -a|grep 61897
*.61897 *.* 0 0 32768 0 LISTEN
lyell.panasas.com.61897 kinsman.2458 1 0 33304 0 ESTABLISHED
lyell.panasas.com.61897 build-bsd6.1851 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 build-bsd6.1855 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 build-bsd6.1863 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2776 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2778 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2781 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2783 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 kinsman.1984 57920 0 33304 0 CLOSE_WAIT
--------------------------------------------------------------------
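For anyone who wants to tally these entries by client, here is a minimal sketch in Python. It is not part of Conserver. The function name close_wait_by_peer and the SAMPLE text are my own illustrative choices, and the parser assumes the seven-column layout shown above (local address, remote address, Swind, Send-Q, Rwind, Recv-Q, state); other netstat versions may need a different field count.

```python
from collections import Counter


def close_wait_by_peer(netstat_text, port):
    """Count CLOSE_WAIT sockets on a local port, grouped by remote host.

    Assumes each TCP row has seven whitespace-separated fields, with the
    local address first, the remote address second and the state last.
    """
    counts = Counter()
    suffix = "." + str(port)
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip headers, blank lines and rows in another layout
        local, remote, state = fields[0], fields[1], fields[6]
        if not local.endswith(suffix) or state != "CLOSE_WAIT":
            continue
        # The remote address is "host.port"; drop the trailing port number.
        host = remote.rsplit(".", 1)[0]
        counts[host] += 1
    return counts


# Sample input copied from the netstat output quoted in this message.
SAMPLE = """\
*.61897 *.* 0 0 32768 0 LISTEN
lyell.panasas.com.61897 kinsman.2458 1 0 33304 0 ESTABLISHED
lyell.panasas.com.61897 build-bsd6.1851 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 build-bsd6.1855 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 build-bsd6.1863 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2776 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2778 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2781 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 rack-bsd2.2783 57920 0 33304 0 CLOSE_WAIT
lyell.panasas.com.61897 kinsman.1984 57920 0 33304 0 CLOSE_WAIT
"""

if __name__ == "__main__":
    for host, n in close_wait_by_peer(SAMPLE, 61897).most_common():
        print(host, n)
```

Feeding it the saved output of "netstat -a" at intervals would show whether the stuck entries keep accumulating from the same few clients.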
One also sees timeouts when using commands such as "console -x"... the
list of connections pauses at a certain point, and eventually times out.
It seems likely that a single Conserver daemon (out of the 55 or so
that are spawned to handle 900 lines) is being affected.
Restarting Conserver is sometimes (but not always) effective in clearing
this up. In many cases, though, the only solution is to reboot the server.
I had previously bumped up certain values in /etc/system (e.g.
"maxusers", "tcp:tcp_conn_hash_size") to better handle the large number
of connections to Conserver, and I'm also planning to install the latest
Solaris patch cluster, in case this is a Solaris TCP/IP issue...
... but I thought I ought to ask the List as well, in case others have
seen this before.
TIA,
S
--
steve lammert software engineer voice: +1-412-323-3500
slammert@panasas.com panasas, inc fax: +1-412-323-3511