Long Running SSH Tunnel, Slowing Down
Hi all,

First time poster, long time OpenSSH user :)

The Situation:

Users within our net require access to a website (http/80) that is
being hosted on another, trusted net. Admins on this other trusted net
are not necessarily as trusting as we are, though they do provide a
ssh gateway. So, one fairly easy solution that was decided upon was to
simply allow users access to this website via a "permanent" SSH tunnel
(-f)

ssh -Nf -L 9100:webserver.trusted.com:80 user@sshgw.trusted.com

The Problem:

After a while -- say a few days/weeks -- of having this tunnel
established, transactions through it slow to a crawl, to the point
where requests will typically time out. Establishing a brand
new tunnel alongside the slow one seems to work fine. I don't
see anything particularly wrong with the endpoint systems other than
that sshd on the ssh gateway is consuming about 1.4 MB of virtual
memory. While this does not pose any threat to the machine per se, it
does seem a bit strange to me.

I'm curious as to what might be happening here, and what -- if
anything -- we can do about it. I've heard from a number of folks that
ssh tunnels for this purpose are a "bad idea" and that we might
consider a connectionless OpenVPN-based solution. That is 100% fine;
however, no one has been able to explain _why_ the tunnel slows down,
which happens to be precisely what interests me. Can someone provide
me with any insights?

The ssh gateway system is CentOS 4.7 w/ OpenSSH 3.9p1 and the client
is Ubuntu 8.04 w/ OpenSSH 4.7p1.

Thanks in advance,
Tim
Re: Long Running SSH Tunnel, Slowing Down
Tim,

I'm not an expert on ssh. IMHO this kind of behaviour is typical of a
memory leak or the application running out of resources in some way.
So this may be a track worth pursuing before having to delve into the
more complex world of decoding packets. Since you can establish a second
tunnel alongside the slow one and it works, this is probably not an
OS-wide resource issue. It may be that this instance of the ssh
application (either your client or the instance of the daemon to which
you connect) has run out of resources.

My first reaction would be to update the software at each end to the
latest revision. Second, look for reports of this specific issue with
the software being used. It may be a known issue that a specific
parameter setting resolves.

If this makes no difference then try the more difficult decoding
route. I suspect that the software has its own commands/tools to help
with this. For example, Cisco has "debug" commands to help with
troubleshooting, such as debug ip ssh
(http://www.cisco.com/en/US/tech/tk583/tk617/technologies_tech_note09186a00800949e2.shtml#debugandshowcommands).
You may spot an increasing number of errors as the link is used, or at
least be able to work out which packets belong to which type of ssh
traffic.



On 22 March 2010 17:32, Timothy O'Keefe <timothy.okeefe@gmail.com> wrote:
>
> Hi all,
>
> First time poster, long time OpenSSH user :)
>
> The Situation:
>
> Users within our net require access to a website (http/80) that is
> being hosted on another, trusted net. Admins on this other trusted net
> are not necessarily as trusting as we are, though they do provide a
> ssh gateway. So, one fairly easy solution that was decided upon was to
> simply allow users access to this website via a "permanent" SSH tunnel
> (-f)
>
> ssh -Nf -L 9100:webserver.trusted.com:80 user@sshgw.trusted.com
>
> The Problem:
>
> After a while -- say a few days/weeks -- of having this tunnel
> established, transactions through this tunnel slow down to a crawl. To
> the point where requests will typically timeout. Establishing a brand
> new tunnel alongside the slowing tunnel seems to work fine. I don't
> see anything particularly wrong with the endpoint systems other than
> that sshd on the ssh gateway is consuming about 1.4 MB of virtual
> memory. While this does not pose any threat to the machine per se, it
> does seem a bit strange to me.
>
> I'm curious as to what might be happening here, and what -- if
> anything -- we can do about it. I've heard from a number of folks that
> ssh tunnels for this purpose are a "bad idea" and that we might
> consider a connectionless OpenVPN based solution. This is 100% fine,
> however no one has been able to explain _why_ the tunnel slows down
> which happens to be precisely what interests me. Can someone provide
> me with any insights?
>
> The ssh gateway system is CentOS 4.7 w/ OpenSSH 3.9p1 and the client
> is Ubuntu 8.04 w/ OpenSSH 4.7p1.
>
> Thanks in advance,
> Tim
Re: Long Running SSH Tunnel, Slowing Down
Hi Tim

Just sharing how I would investigate (not solve) the problem.

My first step would be to sniff traffic at the server side of the slow old tunnel, then run the same transaction over the new tunnel you set up and compare the packet dumps. Then I would analyze the differences and make an informed judgement. As it is your own network, you can even decrypt the data and look at the raw traffic.
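
A rough sketch of the capture step, assuming you have root on the gateway and that tcpdump is installed there (neither is stated in this thread). The helper names `run` and `capture_web_leg` and the file names are made up for illustration. The leg from the gateway to the web server is plain HTTP, so nothing needs decrypting to read it:

```shell
#!/bin/sh
# Sketch only: capture the gateway-to-webserver leg (plain HTTP on port 80).
# WEBSERVER comes from the original post; everything else is an assumption.
WEBSERVER=webserver.trusted.com

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# $1 = output file, $2 = number of packets to capture before stopping
capture_web_leg() {
    run tcpdump -n -i any -c "$2" -w "$1" host "$WEBSERVER" and tcp port 80
}
```

Run `capture_web_leg old.pcap 200` while repeating a request through the slow tunnel, then `capture_web_leg new.pcap 200` while repeating it through the fresh one. Reading each file back with `tcpdump -n -ttt -r old.pcap` prints the time delta between packets, which makes stalls and retransmissions easier to compare by eye.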

Reply if in doubt.

Cheers,
Deadbrain

Re: Long Running SSH Tunnel, Slowing Down
On Tue, Mar 23, 2010 at 09:44:22AM +0000, John Morrison wrote:
> Tim,
>
> I'm not an expert on ssh. IMHO this kind of behaviour is typical of a
> memory leak or the application running out of resources in some way.

Memory leak sounds very feasible here but do check what other processes
are running in general. If you leave a process running for days, also
check what _other_ stuff is running around it. (Andy, who has just had
to reboot his home wifi router which is on 24/7 because it dies about
once a week and loses DNS - a reset fixes it).

> So this may be a track worth pursuing before having to delve in to the
> more complex world of decoding packets. As you can establish a second
> tunnel alongside the slow tunnel and this works it is not an OS
> resource issue. It may be that instance of the ssh application (either
> your client or the instance of the daemon to which you connect) has
> run out of resources.
>

Far down in your reply, you mention ssh using 1.4MiB of virtual memory -
is this the figure from top or some such or do you mean that the machine
is also hitting swap?
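
To tell a one-off figure from a steady leak, it may help to log the numbers over time. A minimal sketch, assuming a Linux gateway with /proc mounted; `mem_sample` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Sketch only: print "epoch-seconds VmSize-kB VmRSS-kB" for one process.
# Linux-specific: reads /proc/<pid>/status.
mem_sample() {
    pid=$1
    awk -v now="$(date +%s)" '
        /^VmSize:/ { vsz = $2 }   # virtual size in kB
        /^VmRSS:/  { rss = $2 }   # resident size in kB
        END        { print now, vsz, rss }
    ' "/proc/$pid/status"
}
```

Calling something like `mem_sample 1234 >> sshd-mem.log` from cron every hour (1234 standing in for the pid of the sshd serving the tunnel) would show whether the virtual size keeps climbing over days or stays flat.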

> My first reaction would be to update the software at each end to the
> latest revision. Second have a look for this specific issue with the
> software being used. It may be a known issue and that a specific
> parameter setting may resolve this.
>

You have access to your end only to do this, I presume, and I note that
you're using 8.04 which is an LTS release: if it's a server, then it's
still within the ?? five years ?? support.

The next iteration, Ubuntu 10.04, is also a long-term-support release;
the beta is out now and the final release is due in April. Try building
a test machine to see whether the issue is resolved there, or whether
other factors might make you consider an upgrade in the future.

> If this makes no difference then try the more difficult decoding
> route. I suspect that the software has its own commands/tools to help
> with this. For example, Cisco has "debug" commands to help with
> troubleshooting, such as debug ip ssh
> (http://www.cisco.com/en/US/tech/tk583/tk617/technologies_tech_note09186a00800949e2.shtml#debugandshowcommands).
> You may spot increasing number of errors as the link is used, or at
> least be able to work out which packets belong to which type of ssh
> traffic.
>

> > After a while -- say a few days/weeks -- of having this tunnel
> > established, transactions through this tunnel slow down to a crawl. To
> > the point where requests will typically timeout. Establishing a brand
> > new tunnel alongside the slowing tunnel seems to work fine. I don't
> > see anything particularly wrong with the endpoint systems other than
> > that sshd on the ssh gateway is consuming about 1.4 MB of virtual
> > memory. While this does not pose any threat to the machine per se, it
> > does seem a bit strange to me.
> >
> > The ssh gateway system is CentOS 4.7 w/ OpenSSH 3.9p1 and the client
> > is Ubuntu 8.04 w/ OpenSSH 4.7p1.
> >

The gateway sysadmin might want to consider CentOS 4.8 as a minimum /
updating from EPEL / RPMForge. OpenSSH 3.9 is desperately old :(

> > Thanks in advance,
> > Tim

Hope this helps,

AndyC
Re: Long Running SSH Tunnel, Slowing Down
On Tue, Mar 23, 2010 at 6:32 AM, Timothy O'Keefe
<timothy.okeefe@gmail.com> wrote:
> Hi all,
>
> The Problem:
>
> After a while -- say a few days/weeks -- of having this tunnel
> established, transactions through this tunnel slow down to a crawl. To
> the point where requests will typically timeout. Establishing a brand
> new tunnel alongside the slowing tunnel seems to work fine. I don't
> see anything particularly wrong with the endpoint systems other than
> that sshd on the ssh gateway is consuming about 1.4 MB of virtual
> memory. While this does not pose any threat to the machine per se, it
> does seem a bit strange to me.

The problem is essentially that TCP tunnels over a TCP transport are a
bad idea. Each TCP layer runs its own retransmission timers, so loss or
delay on the outer connection makes the inner connections retransmit as
well. The effect cascades and eventually slows things down to the point
where the tunnel becomes essentially useless and you'll have to rebuild it.

This is why most VPN and tunnelling solutions work over UDP or their
own IP protocol rather than across TCP.

For practical purposes, the only way to really take care of the issue
is to run scripts that will tear down and recreate the tunnel at set
times. The other alternative is to use a tunnelling method that uses
UDP or some other protocol.
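
A minimal sketch of such a restart script, reusing the host names and port from the original post. The helper names `run` and `restart_tunnel` are made up for illustration, and it assumes key-based login so that ssh can start without a password prompt:

```shell
#!/bin/sh
# Sketch only: tear down and recreate the tunnel from the original post.
LOCAL_PORT=9100
TARGET=webserver.trusted.com:80
GATEWAY=user@sshgw.trusted.com

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

restart_tunnel() {
    # pkill -f matches the pattern against the full command line;
    # "|| true" keeps going when no old tunnel is running.
    run pkill -f "ssh .*-L ${LOCAL_PORT}:${TARGET}" || true
    run sleep 1
    # ExitOnForwardFailure makes ssh exit if it cannot bind the local port.
    run ssh -N -f -o ExitOnForwardFailure=yes \
        -L "${LOCAL_PORT}:${TARGET}" "${GATEWAY}"
}
```

Saved to a file that ends with a call to `restart_tunnel`, this could be run from cron at a quiet hour. `DRY_RUN=1 restart_tunnel` only prints the three commands, which is a safe way to check the pattern and arguments first.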


> I'm curious as to what might be happening here, and what -- if
> anything -- we can do about it. I've heard from a number of folks that
> ssh tunnels for this purpose are a "bad idea" and that we might
> consider a connectionless OpenVPN based solution. This is 100% fine,
> however no one has been able to explain _why_ the tunnel slows down
> which happens to be precisely what interests me. Can someone provide
> me with any insights?

For a good explanation of why this happens and how to resolve it from
a network point of view, see the following PDF.

http://docs.google.com/viewer?a=v&q=cache:TqsO7Bi6-1AJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.21.7007%26rep%3Drep1%26type%3Dpdf+TCP+tunnels+over+TCP+networks+performance&hl=en

Hope that helps.
Re: Long Running SSH Tunnel, Slowing Down
On Thu, Mar 25, 2010 at 11:52 AM, Stephen Cropp <korgan@gmail.com> wrote:
>
> For a good explaination of why this happens and how to resolve it from
> a network point of view, you can see the following PDF.
>
> http://docs.google.com/viewer?a=v&q=cache:TqsO7Bi6-1AJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.21.7007%26rep%3Drep1%26type%3Dpdf+TCP+tunnels+over+TCP+networks+performance&hl=en
>
> Hope that helps.
>

Ugggh... Just realised I sent through the wrong article.

A far more useful link regarding TCP over TCP performance is available at
http://sites.inka.de/~W1011/devel/tcp-tcp.html

Sorry for the misdirection.
Re: Long Running SSH Tunnel, Slowing Down
Thanks so far to everyone who has responded to my question. As a
first-timer on this mailing list, I wasn't entirely sure what to expect.
This is amazingly helpful.

Stephen, thank you for that very helpful and interesting article.

On Wed, Mar 24, 2010 at 7:33 PM, Stephen Cropp <korgan@gmail.com> wrote:
> On Thu, Mar 25, 2010 at 11:52 AM, Stephen Cropp <korgan@gmail.com> wrote:
>>
>> For a good explaination of why this happens and how to resolve it from
>> a network point of view, you can see the following PDF.
>>
>> http://docs.google.com/viewer?a=v&q=cache:TqsO7Bi6-1AJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.21.7007%26rep%3Drep1%26type%3Dpdf+TCP+tunnels+over+TCP+networks+performance&hl=en
>>
>> Hope that helps.
>>
>
> Ugggh... Just realised I sent through the wrong article.
>
> A far more useful link regarding TCP over TCP performance is available at
> http://sites.inka.de/~W1011/devel/tcp-tcp.html
>
> Sorry for the misdirection.
>