Mailing List Archive

[Bug 967] Command only sessions hangs on target system.
http://bugzilla.mindrot.org/show_bug.cgi?id=967

Summary: Command only sessions hangs on target system.
Product: Portable OpenSSH
Version: 3.8.1p1
Platform: All
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: sshd
AssignedTo: openssh-bugs@mindrot.org
ReportedBy: dimitrij@schlund.de


On Debian Linux (sarge) with kernel 2.6.9, a non-privileged sshd process hangs
once the executed command returns. Not every request hangs, but many do:
4683 ? Ss 0:00 /usr/sbin/sshd
6295 ? Ss 0:00 \_ sshd: root@notty
6297 ? Zs 0:00 | \_ [check_dpt] <defunct>
8000 ? Ss 0:00 \_ sshd: root@notty
8002 ? Zs 0:00 | \_ [check_dpt] <defunct>
8048 ? Ss 0:00 \_ sshd: root@notty
8050 ? Zs 0:00 | \_ [check_dpt] <defunct>
8063 ? Ss 0:00 \_ sshd: root@notty
8065 ? Zs 0:00 | \_ [check_dpt] <defunct>
8078 ? Ss 0:00 \_ sshd: root@notty
8080 ? Zs 0:00 | \_ [check_dpt] <defunct>
8098 ? Ss 0:00 \_ sshd: root@notty
8100 ? Zs 0:00 \_ [check_dpt] <defunct>

Dimitrij



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

------- Additional Comments From dtucker@zip.com.au 2005-01-06 10:22 -------
Is this built with PAM? Does the problem occur with 3.9p1?

There was a bug that was fixed in 3.9p1 relating to the handling of SIGCHLD in
the PAM code which could possibly be the cause of this.




------- Additional Comments From dimitrij@schlund.de 2005-01-10 18:43 -------
It is the standard Debian sarge (testing) sshd and was built with PAM. It is 3.4p1.




------- Additional Comments From djm@mindrot.org 2005-01-10 18:48 -------
3.4p1 is 2.5 years old. Please try to reproduce this problem with 3.9p1, or you
can take the bug up with your OS vendor if they refuse to provide a non-ancient
version.




------- Additional Comments From dimitrij@schlund.de 2005-01-10 18:51 -------
Why does 'UsePAM no' in sshd_config not solve this problem?




------- Additional Comments From dtucker@zip.com.au 2005-01-10 19:12 -------
The UsePAM sshd_config(5) directive was introduced in 3.7p1. Prior to that, it
was a compile-time directive only (configure --with-pam).

That said, the bug I was referring to was introduced around 3.8ish, so it can't
be the cause of your problem.

Can you reproduce the problem with 3.9p1? If not, please close this bug and
report it to Debian.




------- Additional Comments From dimitrij@schlund.de 2005-01-10 19:54 -------
This problem still exists with 3.9p1-1 from debian/experimental too.

strace:
Process 11107 attached - interrupt to quit
futex(0xb7e573cc, FUTEX_WAIT, 2, NULL^X <unfinished ...>

gdb:
balancedev2:~# gdb -p 11107
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux".
Attaching to process 11107
Using host libthread_db library "/lib/tls/libthread_db.so.1".

warning: could not load vsyscall page because no executable was specified

warning: try using the "file" command first
Reading symbols from /usr/sbin/sshd...(no debugging symbols found)...done.
Reading symbols from /lib/libwrap.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libwrap.so.0
Reading symbols from /lib/libpam.so.0...
(no debugging symbols found)...done.
Loaded symbols for /lib/libpam.so.0
Reading symbols from /lib/tls/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/tls/libresolv.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libresolv.so.2
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
Reading symbols from /lib/tls/libutil.so.1...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libutil.so.1
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/libnsl.so.1...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libnsl.so.1
Reading symbols from /lib/tls/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libcrypt.so.1
Reading symbols from /lib/tls/libpthread.so.0...
(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
[New Thread -1210984832 (LWP 11107)]
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/libnss_compat.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib/tls/libnss_compat.so.2
Reading symbols from /lib/tls/libnss_nis.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libnss_nis.so.2
Reading symbols from /lib/tls/libnss_files.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib/tls/libnss_files.so.2
Reading symbols from /lib/tls/libnss_dns.so.2...
(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libnss_dns.so.2
0xb7e077a1 in pthread_setcanceltype ()
from /lib/tls/libc.so.6
(gdb) bt
#0 0xb7e077a1 in pthread_setcanceltype () from /lib/tls/libc.so.6
#1 0x00000001 in ?? ()
#2 0xb7e55fcc in ?? () from /lib/tls/libc.so.6
#3 0x00000001 in ?? ()
#4 0xb7df7523 in setlogmask () from /lib/tls/libc.so.6
#5 0xbfffdd74 in ?? ()
#6 0xb7df74a0 in setlogmask () from /lib/tls/libc.so.6
#7 0x00000000 in ?? ()
#8 0x00000001 in ?? ()
#9 0x00000000 in ?? ()
#10 0xbfffe174 in ?? ()
#11 0xbfffdd74 in ?? ()
#12 0x00000007 in ?? ()
#13 0xbfffe58c in ?? ()
#14 0x0806f3ac in error ()
#15 0x0806f150 in error ()
#16 0x08053301 in ?? ()
#17 0x0807ff45 in _IO_stdin_used ()
#18 0xb7e46d5b in in6addr_loopback () from /lib/tls/libc.so.6
#19 0xbfffea6c in ?? ()
#20 0xbfffe63c in ?? ()
#21 0x00000009 in ?? ()
#22 0x00000000 in ?? ()
#23 0x00000008 in ?? ()
#24 <signal handler called>
#25 0xb7dfba1c in send () from /lib/tls/libc.so.6
#26 0xb7df6e10 in vsyslog () from /lib/tls/libc.so.6
#27 0xb7df6aaf in syslog () from /lib/tls/libc.so.6
#28 0x0806f3c1 in error ()
#29 0x0806f150 in error ()
#30 0x08054e27 in ?? ()
#31 0x080800d4 in _IO_stdin_used ()
#32 0x00000005 in ?? ()
#33 0xbffff358 in ?? ()
#34 0x08053d5f in ?? ()
#35 0x0808f334 in stdin ()
#36 0x080532e0 in ?? ()
#37 0x00000000 in ?? ()
#38 0x00000000 in ?? ()
#39 0x08092c20 in stdin ()
#40 0x00000000 in ?? ()
#41 0xbffff358 in ?? ()
#42 0x00000000 in ?? ()
#43 0x00000000 in ?? ()
#44 0x00000000 in ?? ()
#45 0x00000000 in ?? ()
#46 0x00002b65 in ?? ()
#47 0x08092c20 in stdin ()
#48 0xbffff380 in ?? ()
#49 0xbffff398 in ?? ()
#50 0x0805793a in ?? ()
#51 0x00002b65 in ?? ()
#52 0x00000005 in ?? ()
#53 0x00000005 in ?? ()
#54 0x00000007 in ?? ()
#55 0x080965c0 in ?? ()
#56 0x0809bc62 in ?? ()
#57 0x00000006 in ?? ()
#58 0x00000007 in ?? ()
#59 0x00000004 in ?? ()
#60 0x00000005 in ?? ()
#61 0xbffff3a8 in ?? ()
#62 0x080965c0 in ?? ()
#63 0x08092c20 in stdin ()
#64 0x00000000 in ?? ()
#65 0xbffff3b8 in ?? ()
#66 0x08057dcc in ?? ()
#67 0x08092c20 in stdin ()
#68 0x080965c0 in ?? ()
#69 0xbffff3c4 in ?? ()
#70 0x080723e9 in error ()
Previous frame inner to this frame (corrupt stack?)

Dimi




------- Additional Comments From dtucker@zip.com.au 2005-01-10 20:07 -------
Hrm...
> Reading symbols from /lib/tls/libpthread.so.0...

It looks like Debian built sshd with the pthread hack, which is unsupported (and
opens a whole other can of worms).

Can you reproduce it with 3.9p1 with "UsePAM no" in sshd_config?




------- Additional Comments From dimitrij@schlund.de 2005-01-10 20:41 -------
Update:

Only ssh -1 is broken (many command-only keys are ssh1); ssh -2 is OK.





------- Additional Comments From dtucker@zip.com.au 2005-01-10 20:53 -------
That's interesting. Could you please attach (note: please use "create
attachment", don't paste into the text field) your sshd_config.

Also, could you please try running the server in debug mode ("/path/to/sshd
-ddde -p 2022" then connect to port 2022) and if you can reproduce the problem
with it in debug mode, please attach that log separately.




------- Additional Comments From dimitrij@schlund.de 2005-01-10 21:03 -------
Created an attachment (id=761)
--> (http://bugzilla.mindrot.org/attachment.cgi?id=761&action=view)
My sshd_config





------- Additional Comments From dimitrij@schlund.de 2005-01-10 21:04 -------
This problem can't be reproduced in debug mode. My sshd_config is now attached.




dtucker@zip.com.au changed:

What |Removed |Added
----------------------------------------------------------------------------
Version|3.8.1p1 |3.9p1



------- Additional Comments From dtucker@zip.com.au 2005-01-10 21:57 -------
OK, if it can't be provoked in debug mode then the other option is to kick the
debugging up to DEBUG3 and get the messages from syslog.

I see that you already have the LogLevel at DEBUG, which may have some clues.
Could you grep a failing session out by pid and attach that?

Also, if you kill one of the "sshd: root@notty" processes does the "defunct"
process vanish too? I've seen defunct processes on Linux wedge unkillably,
usually when they get straced.
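
For anyone following along, pulling one session out of syslog by pid can be sketched roughly as below. This is only an illustration in Python: the `sshd[PID]:` tag is the usual syslog convention, while the helper name `lines_for_pid` and the sample lines are made up for the example and are not taken from the attached logs.

```python
import re


def lines_for_pid(log_lines, pid):
    """Return only the syslog lines that were emitted by sshd process `pid`."""
    # Match the "sshd[PID]:" tag so that lines from other processes which
    # merely mention the number in their message text are not picked up.
    tag = re.compile(r"\bsshd\[%d\]:" % pid)
    return [line for line in log_lines if tag.search(line)]


# Hypothetical sample lines, invented purely for illustration.
sample = [
    "Jan 10 22:01:03 host sshd[11107]: debug1: example message A",
    "Jan 10 22:01:03 host sshd[11110]: debug1: example message about 11107",
    "Jan 10 22:01:04 host sshd[11107]: debug1: example message B",
]
matched = lines_for_pid(sample, 11107)
for line in matched:
    print(line)
```

Matching on the tag rather than the bare number keeps the second sample line out, since it comes from a different sshd process that only mentions the pid.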




------- Additional Comments From dimitrij@schlund.de 2005-01-10 22:07 -------
Created an attachment (id=762)
--> (http://bugzilla.mindrot.org/attachment.cgi?id=762&action=view)
Logs from parent





------- Additional Comments From dimitrij@schlund.de 2005-01-10 22:09 -------
Yes, if I kill the parent process, then the child zombie dies too. Is the
SIGCHLD handler buggy?




------- Additional Comments From dtucker@zip.com.au 2005-01-10 22:30 -------
It could be a problem with the SIGCHLD handler but I have a feeling it's some
kind of race.

Do you see the problem running, say, "sleep 1" compared to "true" via ssh ?

I'm guessing that if the command exits *really* fast then the SIGCHLD might be
delivered before the handler is set up. If that's true then I would expect the
sleep's to be problem free but the true's to exhibit the problem.
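
The suspected window can be illustrated outside sshd. The Python sketch below is not sshd code and makes several assumptions: the helper name `sigchld_seen_after_late_install` is hypothetical, the delays are arbitrary, and SIGCHLD is assumed to have its default disposition when the script starts. It forks a child, installs the SIGCHLD handler a little later, and reports whether the handler ever ran.

```python
import os
import signal
import time


def sigchld_seen_after_late_install(child_delay):
    """Fork a child that exits after `child_delay` seconds, install the
    SIGCHLD handler 0.1 s after the fork, and report whether it ever ran."""
    seen = []
    pid = os.fork()
    if pid == 0:
        # Child: stand-in for the remote command ("true" or "sleep N").
        time.sleep(child_delay)
        os._exit(0)
    # Parent: the window during which no handler is installed yet.
    time.sleep(0.1)
    previous = signal.signal(signal.SIGCHLD,
                             lambda signo, frame: seen.append(signo))
    try:
        # Wait up to one second for the handler to be invoked.
        deadline = time.monotonic() + 1.0
        while not seen and time.monotonic() < deadline:
            time.sleep(0.05)
    finally:
        # Restore the earlier disposition and reap the child.
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)
        try:
            os.waitpid(pid, 0)
        except ChildProcessError:
            pass
    return bool(seen)


fast = sigchld_seen_after_late_install(0.0)  # behaves like "true"
slow = sigchld_seen_after_late_install(0.6)  # behaves like a short "sleep"
print("handler ran for fast child:", fast)
print("handler ran for slow child:", slow)
```

With the default disposition, a SIGCHLD that arrives before the handler exists is simply discarded, so a parent that relies only on the handler never learns that the fast child has exited.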




------- Additional Comments From dimitrij@schlund.de 2005-01-10 23:07 -------
Bingo! With "sleep 1" sshd works OK. The first command with true makes a zombie.
That is why I never saw the problem with -ddd.




------- Additional Comments From dtucker@zip.com.au 2005-01-10 23:26 -------
Created an attachment (id=763)
--> (http://bugzilla.mindrot.org/attachment.cgi?id=763&action=view)
check for pending child after setting up handler

Here's a patch to try, against 3.9p1. It will (assuming I got it right :-)
check for a pending child and set the appropriate flags immediately after
setting up the SIGCHLD handler.

I would guess that the reason it doesn't happen with debugging on is that the
debugging changes the timing enough for it to miss the window.
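
The general idea of checking for a pending child right after installing the handler can be sketched as follows. This is a Python illustration only and not a reconstruction of the attached patch: the names `state`, `on_sigchld` and `install_handler_and_check` are hypothetical, and the sketch reaps the child during the check, which the real code may well not do.

```python
import os
import signal
import time

# Flag that a main loop would poll; a dict is used so the handler can update it.
state = {"child_terminated": False}


def on_sigchld(signo, frame):
    """Signal handler: only record that some child has changed state."""
    state["child_terminated"] = True


def install_handler_and_check(pid):
    """Install the SIGCHLD handler, then immediately poll for a child that
    exited before the handler existed. Returns its exit status, or None if
    the child is still running."""
    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    reaped, status = os.waitpid(pid, os.WNOHANG)  # non-blocking check
    if reaped == pid:
        # The child was already gone: set the flag the handler would have set.
        state["child_terminated"] = True
        return previous, os.WEXITSTATUS(status)
    return previous, None


pid = os.fork()
if pid == 0:
    os._exit(7)      # child exits immediately, like "true" (status 7 is arbitrary)
time.sleep(0.2)      # the child is gone before any handler is installed
previous, exit_status = install_handler_and_check(pid)
# Put the earlier disposition back so the sketch has no lasting side effects.
signal.signal(signal.SIGCHLD,
              previous if previous is not None else signal.SIG_DFL)
print("exit status:", exit_status, "flag:", state["child_terminated"])
```

The non-blocking `waitpid` closes the gap between fork and handler installation: whichever of the two paths notices the exit first, the flag ends up set.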




dtucker@zip.com.au changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #763 is obsolete|0 |1



------- Additional Comments From dtucker@zip.com.au 2005-01-10 23:47 -------
Created an attachment (id=764)
--> (http://bugzilla.mindrot.org/attachment.cgi?id=764&action=view)
check for pending child after setting up handler

Oops, looks like I didn't get it right after all... Try this one instead.




------- Additional Comments From dimitrij@schlund.de 2005-01-11 06:39 -------
Hi,

I tried to patch 3.8p1 with the second patch. Same problem with /bin/true as the command.

I will try 3.9p1 tomorrow, but 3.8p1 is the interesting case because it is the default
in Debian stable/testing. So I think the Debian maintainer must backport your patch too.




------- Additional Comments From dtucker@zip.com.au 2005-01-11 09:39 -------
If the patch didn't help then the problem is probably elsewhere. Could you
please bump LogLevel to DEBUG3 and grep a failing session out of syslog as
mentioned earlier?




------- Additional Comments From dimitrij@schlund.de 2005-01-11 18:36 -------
Created an attachment (id=767)
--> (http://bugzilla.mindrot.org/attachment.cgi?id=767&action=view)
hangs with DEBUG3





------- Additional Comments From dimitrij@schlund.de 2005-01-11 18:37 -------
Hi,

I've patched 3.8p1 with the second patch and now get this error on the client every time:
Received disconnect from 10.0.0.3: wait: Bad file descriptor




------- Additional Comments From dtucker@zip.com.au 2005-01-11 19:00 -------
So I understand you: with patch #764 you get that error but it doesn't hang? If
so then we're probably on the right track, but my patch isn't quite right.




------- Additional Comments From dtucker@zip.com.au 2005-01-11 19:13 -------
Please double-check that you're using the patch in attachment #764. I saw that
bogus disconnect error with the older patch but I can't reproduce it with #764.


