conserver -> cyclade, ssh timeouts

Hey folks,

So I've brought conserver to my next company - or at least I'm trying to. ;)

I'm having a problem where the ssh connection from conserver to the
Cyclades times out. The ssh client sees it happen, and thus conserver
sees it happen, but the Cyclades never does, so the sock_sshd for that
port lives on and conserver can never reconnect to it.

I know this is more of a cyclades question and less of a conserver
question - but I figure this is probably one of the best places to ask
if anyone's run into this.

Presumably the connection dies because there are firewalls in between
the two boxes that limit inactive state timeouts.

I've tried turning on keep alives on both sides... but haven't solved
the problem...

Thoughts?

--
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com
_______________________________________________
users mailing list
users@conserver.com
https://www.conserver.com/mailman/listinfo/users
Re: conserver -> cyclade, ssh timeouts
Tune the keep alives to be more often. I think the default is 7200
seconds. Make it 300?
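
For reference, here is a minimal Python sketch of what tuning keepalives looks like at the socket level. The 7200-second figure matches the Linux default idle time before the first probe. The 300/60/5 values below are illustrative assumptions, and the TCP_KEEP* option names are the Linux ones, so the sketch applies each only where the platform exposes it. It tunes a socket you own and does not change what ssh itself does.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, probes=5):
    """Turn on TCP keepalive and, where the platform allows, tune its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options (Linux names). Each one is applied only if
    # this platform's socket module actually defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

Calling enable_keepalive() on a freshly created TCP socket before connecting is enough. The idle value only helps if it is shorter than the firewall's idle-state timeout.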

I had the same issue with VPN tunnels between devices that needed to
stay connected but were quiet. I told pppd to send an echo packet every
60 seconds. That solved that problem.

Some firewalls/routers may be intelligent enough to see a keep alive and
not count it. I'm not a pro on that subject so I'm not sure.

What concerns me most is why the Cyclades box does not see the failed
connection. When this happens, are you basically locked out of that port
until you do something like a reboot on the Cyclades? If so, I would
call support and ask if there is a patch that fixes this.


On Mon, 2005-10-10 at 18:33 -0700, Phil Dibowitz wrote:
> Hey folks,
>
> So I've brought conserver to my next company - or at least I'm trying to. ;)
>
> I'm having a problem where the ssh connection from conserver to the
> cyclades times out, and the ssh client sees it happen, and thus
> conserver sees it happen - but the cyclade never sees this happen, and
> thus the sock_sshd for that port lives on and the cyclade can never connect.
>
> I know this is more of a cyclades question and less of a conserver
> question - but I figure this is probably one of the best places to ask
> if anyone's run into this.
>
> Presumably the connection dies because there are firewalls in between
> the two boxes that limit inactive state timeouts.
>
> I've tried turning on keep alives on both sides... but haven't solved
> the problem...
>
> Thoughts?

Re: conserver -> cyclade, ssh timeouts

Christopher Fowler wrote:
> Tune the keep alives to be more often. I think the default is 7200
> seconds. Make it 300?
>
> I had the same issue with VPN tunnels between devices that needed to
> stay connected but were quiet. I told pppd to send an echo packet every
> 60 seconds. That solved that problem.
>
> Some firewalls/routers may be intelligent enough to see a keep alive and
> not count it. I'm not a pro on that subject so I'm not sure.
>
> What concerns me most is why the Cyclades box does not see the failed
> connection. when this happens are you basically locked out of that port
> until you do something like a reboot on the Cyclades? If so I would
> call support and ask if there is a patch that fixes this.

I don't have to reboot, but yes, I'm locked out of the port until I log
in and kill the sock_sshd process attached to that port.

--
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com
Re: conserver -> cyclade, ssh timeouts
On Mon, Oct 10, 2005 at 09:49:38PM -0400, Christopher Fowler wrote:
> Tune the keep alives to be more often. I think the default is 7200
> seconds. Make it 300?
>
> I had the same issue with VPN tunnels between devices that needed to
> stay connected but were quiet. I told pppd to send an echo packet every
> 60 seconds. That solved that problem.
>
> Some firewalls/routers may be intelligent enough to see a keep alive and
> not count it. I'm not a pro on that subject so I'm not sure.
>
> What concerns me most is why the Cyclades box does not see the failed
> connection. when this happens are you basically locked out of that port
> until you do something like a reboot on the Cyclades? If so I would
> call support and ask if there is a patch that fixes this.
>
> On Mon, 2005-10-10 at 18:33 -0700, Phil Dibowitz wrote:
> > I'm having a problem where the ssh connection from conserver to the
> > cyclades times out, and the ssh client sees it happen, and thus
> > conserver sees it happen - but the cyclade never sees this happen, and
> > thus the sock_sshd for that port lives on and the cyclade can never connect.
> >
> > I know this is more of a cyclades question and less of a conserver
> > question - but I figure this is probably one of the best places to ask
> > if anyone's run into this.
> >
> > Presumably the connection dies because there are firewalls in between
> > the two boxes that limit inactive state timeouts.
> >
> > I've tried turning on keep alives on both sides... but haven't solved
> > the problem...

I'm with Charlie on this one...when Phil says "the ssh client sees it
happen", does this mean that the SSH session from the Conserver host to
the Cyclades port received a FIN to close the session? If so, can you
tell whether the firewall sent the FIN (maybe because it was going to drop
an idle connection?), or whether the SSH client itself timed out and tried
to send a FIN to the Cyclades while closing the connection to Conserver?
Maybe the firewall had already closed the idle-looking session by then.

In the 'plain vanilla' telnet world, it was pretty common to see a
reverse TCP session (or many) be initialized, and then have the host
crash, or otherwise need to be rebooted. At that point, the host had
no chance to send a FIN to any of the reverse TCP sessions prior to
the reboot, and all of those sessions were lost in the reboot.

When the host recovered, and tried to re-make the reverse TCP
sessions, it would be refused, as the sessions were already in use...

On the console server, the session is still established. Just because
it hasn't heard from the host doesn't mean there is a problem. BUT, if
something comes in the serial port, and the console server tries to send
it to the corresponding session on the host, the host never answers, and
the Console Server will close *that* session because the host was not
answering. If nothing comes in the serial port to stimulate that action,
the session could stay up indefinitely.
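
The behaviour described above can be put into a small Python model. This is a toy sketch, not Cyclades code: time_noticed and its parameters are hypothetical names, times are in seconds, probes are assumed to fire at fixed multiples of the interval, and the extra time TCP retransmissions take before a send is declared failed is ignored.

```python
import math


def time_noticed(peer_died_at, serial_events, keepalive_interval=None):
    """Return when the console server learns the peer is gone, or None.

    The server only finds out when it has to send something: either data
    arriving on the serial port, or (if enabled) a keepalive probe.
    """
    # Serial data that arrives after the peer died forces a send that fails.
    candidates = [t for t in serial_events if t >= peer_died_at]
    if keepalive_interval:
        # First probe at or after the moment the peer disappeared.
        next_probe = math.ceil(peer_died_at / keepalive_interval) * keepalive_interval
        candidates.append(next_probe)
    return min(candidates) if candidates else None
```

With no serial traffic and no keepalives, time_noticed(100, []) returns None: the stale session is never cleaned up. That is the "stays up indefinitely" case.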

In these cases, you would need to log into the console server, attain
elevated privileges, and "clear" or "reset" the line(s) in question.
An ugly solution, but the alternative is rebooting the Console Server.
(Clearing the lines manually will preserve the uptime on your Console
Server, if such metrics are important to you. ;-)

I like the 60-second timeout. A bit chatty, but not bad. Which firewall
or VPN solution are you trying to plumb this through? :-)

-Z-

Re: conserver -> cyclade, ssh timeouts
On Mon, 10 Oct 2005, Phil Dibowitz wrote:

> Hey folks,
>
> So I've brought conserver to my next company - or at least I'm trying to. ;)
>
> I'm having a problem where the ssh connection from conserver to the
> cyclades times out, and the ssh client sees it happen, and thus
> conserver sees it happen - but the cyclade never sees this happen, and
> thus the sock_sshd for that port lives on and the cyclade can never connect.
>
> I know this is more of a cyclades question and less of a conserver
> question - but I figure this is probably one of the best places to ask
> if anyone's run into this.
>

FWIW, cyclades has a mailing list for users at cyusers@cyclades.com. It's
a very low traffic list and their support folks and lower middle
management do monitor the list and take problem reports seriously. This
list is not easy to find on their website, which I think is part of the
reason it's low traffic :)

HTH,

-n
--
-------------------------------------------
nathan hruby <nhruby@uga.edu>
uga enterprise information technology services
production systems support
-------------------------------------------
Re: conserver -> cyclade, ssh timeouts

nathan r. hruby wrote:
> FWIW, cyclades has a mailing list for users at cyusers@cyclades.com. It's
> a very low traffic list and their support folks and lower middle
> management do monitor the list and take problem reports seriously. This
> list is not easy to find on their website, which I think is part of the
> reason it's low traffic :)

Sweet! Some extra hunting and I found the subscription page.

It is well hidden. ;)

Thanks, I'll post this over there, and I'll also call them today (no one
here is clear on whether we're still under warranty).

--
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com
Re: conserver -> cyclade, ssh timeouts
In a flurry of recycled electrons, Phil Dibowitz wrote:

> Sweet! Some extra hunting and I found the subscription page.
>
> It is well hidden. ;)

Share (& Enjoy)?

z!
Re: conserver -> cyclade, ssh timeouts
nathan r. hruby wrote:
> FWIW, cyclades has a mailing list for users at cyusers@cyclades.com. It's
> a very low traffic list and their support folks and lower middle
> management do monitor the list and take problem reports seriously. This
> list is not easy to find on their website, which I think is part of the
> reason it's low traffic :)

Hmm. So I'm no longer convinced this is entirely a cyclades issue. But
it kinda depends on whether I'm reading the conserver logs right.

Conserver gets a console up:

[Wed Nov 30 16:40:56 2005] conserver (6459): [swx3.sys.adm1] console up

The ssh connection is somehow killed:

[Wed Nov 30 18:41:26 2005] conserver (6459): [swx3.sys.adm1] exit(255)

It then retries for an hour:

[Wed Nov 30 18:41:26 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
[Wed Nov 30 18:41:35 2005] conserver (6459): [swx3.sys.adm1] exit(255)
[Wed Nov 30 18:41:35 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
[Wed Nov 30 18:41:47 2005] conserver (6459): [swx3.sys.adm1] exit(255)
[Wed Nov 30 18:41:47 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
[Wed Nov 30 18:42:00 2005] conserver (6459): [swx3.sys.adm1] exit(255)
[Wed Nov 30 18:42:00 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization


It then backs off for 30 seconds and brings it up successfully (does
this "up" mean it's back to trying again, or that it was able to
successfully bring the console up?):

[Wed Nov 30 19:30:01 2005] conserver (6459): ERROR: [swx3.sys.adm1] initialization rate exceeded: forcing down
[Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
[Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] console up

At which point it almost immediately dies again:

[Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] exit(255)
[Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization

Sometimes it'll stay up for a few seconds or so:

[Thu Dec 1 01:29:56 2005] conserver (6459): [swx3.sys.adm1] console up
[Thu Dec 1 01:30:00 2005] conserver (6459): [swx3.sys.adm1] exit(255)
[Thu Dec 1 01:30:00 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
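
To see how long each session actually stayed up, the timestamps can be subtracted mechanically. Below is a rough Python sketch, assuming each log entry sits on a single line in the layout shown above; session_lengths and SAMPLE_LOG are names made up for this example, and the sample lines are copied from the log excerpts in this message.

```python
import re
from datetime import datetime

# Matches "[timestamp] conserver (pid): [console] event". ERROR lines have
# an extra field before the console name and are deliberately skipped.
LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] conserver \(\d+\): \[(?P<console>[^\]]+)\] (?P<event>.+)$"
)

SAMPLE_LOG = [
    "[Wed Nov 30 16:40:56 2005] conserver (6459): [swx3.sys.adm1] console up",
    "[Wed Nov 30 18:41:26 2005] conserver (6459): [swx3.sys.adm1] exit(255)",
    "[Thu Dec 1 01:29:56 2005] conserver (6459): [swx3.sys.adm1] console up",
    "[Thu Dec 1 01:30:00 2005] conserver (6459): [swx3.sys.adm1] exit(255)",
]


def session_lengths(lines):
    """Yield (console, seconds) for each 'console up' followed by an exit."""
    up_since = {}
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        when = datetime.strptime(match["ts"], "%a %b %d %H:%M:%S %Y")
        console, event = match["console"], match["event"]
        if event == "console up":
            up_since[console] = when
        elif event.startswith("exit(") and console in up_since:
            elapsed = when - up_since.pop(console)
            yield console, int(elapsed.total_seconds())
```

For the sample lines, list(session_lengths(SAMPLE_LOG)) gives 7230 seconds for the first session and 4 seconds for the later one. The first figure is just over two hours, close to the 7200-second keepalive default mentioned earlier in the thread, which may or may not be a coincidence.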

I'm sending a report to cyclades as well - but I thought I'd run this
by the other experts here. ;)

--
Phil Dibowitz
P: 310-360-2330 C: 213-923-5115
Unix Admin, Ticketmaster.com
Re: conserver -> cyclade, ssh timeouts
On Thu, Dec 01, 2005 at 02:03:06PM -0800, Phil Dibowitz wrote:
> [Wed Nov 30 19:30:01 2005] conserver (6459): ERROR: [swx3.sys.adm1] initialization rate exceeded: forcing down
> [Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization
> [Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] console up
>
> At which point it almost immediately dies again:
>
> [Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] exit(255)
> [Wed Nov 30 19:30:56 2005] conserver (6459): [swx3.sys.adm1] automatic reinitialization

here's the deal with these messages. if a console is forced down (like
above), then when it succeeds in coming back up the 'console up' message
is generated. when it goes down, it retries, and all is well, no
'console up' is displayed (not sure why, but there was reasoning at the
time). that's the real difference between all your logs. in general,
conserver is forking off the command, seeing it exit with a -1 (255
since it's printing as unsigned) and retrying. they seem to stay
running about 4 seconds, before they exit. since this is a command and
conserver has no idea if the command is succeeding or not (as long as it
forks off, it assumes so), it marks it as up (for about 4 seconds) and
catches it dying and retries. looks like it was respawning a bit
quicker above, so it hit the reinitialization rate...but that doesn't
really deviate from the general behavior.
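
The -1 versus 255 point can be checked with a few lines of Python. This is only an illustration of the arithmetic, with made-up helper names. Separately, the ssh man page says ssh itself exits with 255 when an error occurs, so a 255 from an ssh command may simply be ssh's own error status.

```python
import subprocess
import sys


def as_unsigned_byte(status):
    """Show a (possibly negative) exit status the way an 8-bit field holds it."""
    return status & 0xFF


def child_exit_status(code):
    """Run a child Python that exits with `code`; return what the parent sees."""
    child = [sys.executable, "-c", f"import sys; sys.exit({code})"]
    return subprocess.run(child).returncode
```

as_unsigned_byte(-1) is 255, and on POSIX systems a parent only ever sees the low 8 bits of a child's exit status.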

it doesn't look like a conserver issue to me (it's doing what i think it
should, at least). it sounds like the ssh connection isn't being
dropped on the cyclades side (since it probably didn't get a FIN
packet, for whatever reason?) and keeping the port in use. conserver
can't tell what's really going on since the command could be doing
anything. i'd suggest having what conserver forks off do some logging
somewhere to see if ssh is getting a connection refused or failing to
authenticate or what...might help diagnose the issue. the fact the
fork, ssh, etc is taking so long makes me think it might not be just a
simple "connection refused".
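
One way to get that logging is a small wrapper placed between conserver and ssh. The Python sketch below is an assumption about how such a wrapper could look, not anything shipped with conserver: run_logged and the log path are hypothetical, and only stderr is diverted so the console session itself still flows through stdin/stdout.

```python
import subprocess
import time


def run_logged(argv, log_path):
    """Run argv, appending its stderr, exit status and run time to log_path."""
    started = time.time()
    with open(log_path, "a") as log:
        log.write(f"{time.ctime(started)} start {argv}\n")
        log.flush()  # make sure our line lands before the child's output
        # stdin/stdout are inherited from the caller; stderr goes to the log,
        # which is where ssh reports refused connections and auth failures.
        status = subprocess.call(argv, stderr=log)
        elapsed = time.time() - started
        log.write(f"{time.ctime()} exit={status} after {elapsed:.1f}s\n")
    return status

# Hypothetical use from a wrapper script:
#   sys.exit(run_logged(["ssh", "user@console-server"], "/tmp/console-ssh.log"))
```

The log then shows, for each attempt, what ssh complained about and how long it ran before exiting, which should help separate a plain "connection refused" from something slower such as an authentication failure.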

Bryan