Mailing List Archive

[Bug 3598] Dead lock of sshd and Defunct of sshd
https://bugzilla.mindrot.org/show_bug.cgi?id=3598

Darren Tucker <dtucker@dtucker.net> changed:

What              |Removed                  |Added
----------------------------------------------------------------------------
CC                |                         |dtucker@dtucker.net

--- Comment #1 from Darren Tucker <dtucker@dtucker.net> ---
Created attachment 3710
--> https://bugzilla.mindrot.org/attachment.cgi?id=3710&action=edit
Block signals while syslogging

You could try blocking signals while it's in syslog.

That said, if the problem is that it's blocking in syslog indefinitely
in the first call (and if it's timing out after 90s, that seems likely)
you'll still have sshds blocked in syslog.

--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs

--- Comment #2 from mzhan017 <mark.zhang@nokia-sbell.com> ---
Darren,
Yes, you're correct.
We could be blocked in the first syslog call, even without the
deadlock.

But we could still face the problem of the number of processes and the
memory usage continuing to increase.

Is it possible to defend against a blocked syslog? That would help
ensure that we can still log in to the system reliably over ssh.


Thanks,
Mark

Damien Miller <djm@mindrot.org> changed:

What              |Removed                  |Added
----------------------------------------------------------------------------
CC                |                         |djm@mindrot.org

--- Comment #3 from Damien Miller <djm@mindrot.org> ---
IMO the problem is fundamentally that we're doing operations in a
signal handler that are unsafe on some platforms. We should probably
make sigdie() a noop anywhere snprintf()+syslog() are not guaranteed to
be safe, which AFAIK is everything other than OpenBSD.

--- Comment #4 from Darren Tucker <dtucker@dtucker.net> ---
(In reply to Damien Miller from comment #3)
> IMO the problem is fundamentally that we're doing operations in a
> signal handler that are unsafe on some platforms.

That's true but I don't think it's the problem here, unless they just
happen to be hitting a signal race at exactly 90s.

> We should probably
> make sigdie() a noop anywhere snprintf()+syslog() are not guaranteed
> to be safe, which AFAIK is everything other than OpenBSD.

Now that privsep is mandatory we could move the LoginGraceTime signal
handler into the privsep child and just have it _exit(somenumber), then
have the monitor read that exit code in its normal event loop and log
from there. I have most of the code for the monitor side written as
part of another thing.

--- Comment #5 from Damien Miller <djm@mindrot.org> ---
nerfing sigdie would mean that we lose the following log messages:

auth-pam.c: sigdie("PAM: authentication thread exited unexpectedly");
auth-pam.c: sigdie("PAM: authentication thread exited uncleanly");
sshd.c: sigdie("Timeout before authentication for %s port %d",

I was about to suggest what Darren said re arranging for the process to
exit with a magic value and moving the logging to the parent, but I see
that he beat me to it :)

OTOH I don't love the idea of moving the grace alarm to the privsep
child, since the child is intended not to be trustworthy. Other options
include implementing LoginGraceTime in the monitor mainloop or having
the listener do the logging (AFAIK it's still around at this point for
MaxStartups tracking).

--- Comment #6 from Darren Tucker <dtucker@dtucker.net> ---
(In reply to Damien Miller from comment #5)
> Other options
> include implementing LoginGraceTime in the monitor mainloop

That's non-trivial, since some of the potential timeouts occur before
the monitor mainloop, e.g. in kex_exchange_identification().

> or having the listener do the logging (AFAIK it's still around at this
> point for MaxStartups tracking)

That should be doable with a bit of plumbing. The only caveat I can
think of is that the timeout log messages will come from a pid not
directly associated with the connection.

Damien Miller <djm@mindrot.org> changed:

What                        |Removed                  |Added
----------------------------------------------------------------------------
Attachment #3711 Flags      |ok?(dtucker@dtucker.net) |
Attachment #3711 is obsolete|0                        |1
Attachment #3714 Flags      |                         |ok?(dtucker@dtucker.net)

--- Comment #8 from Damien Miller <djm@mindrot.org> ---
Created attachment 3714
--> https://bugzilla.mindrot.org/attachment.cgi?id=3714&action=edit
Fixed diff

Revised diff that fixes a couple of logic errors and simplifies some
code.

Damien Miller <djm@mindrot.org> changed:

What                        |Removed                  |Added
----------------------------------------------------------------------------
Attachment #3714 Flags      |ok?(dtucker@dtucker.net) |

--- Comment #9 from Damien Miller <djm@mindrot.org> ---
Comment on attachment 3714
--> https://bugzilla.mindrot.org/attachment.cgi?id=3714
Fixed diff

Actually, this diff has a big problem too: because it tracks all child
processes in the same structure, and because the tracking logic is
incorrect, it limits the _total_ number of concurrent sessions to
MaxStartups and not just the number of _authenticating_ sessions :(

Damien Miller <djm@mindrot.org> changed:

What                        |Removed                  |Added
----------------------------------------------------------------------------
Attachment #3714 is obsolete|0                        |1
Attachment #3715 Flags      |                         |ok?(dtucker@dtucker.net)

--- Comment #10 from Damien Miller <djm@mindrot.org> ---
Created attachment 3715
--> https://bugzilla.mindrot.org/attachment.cgi?id=3715&action=edit
Really fixed diff

This should fix the problems in the previous diff and simplifies things
a little more.

Child processes now signal that authentication was successful back to
the listener, so it can stop tracking them. They do this by sending
another char over the startup_pipe, in addition to the first one they
send to signal they have received their rexec state. When the listener
is so notified, it stops caring about the subprocess and frees up its
slot so it doesn't count against MaxStartups.
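
The two-byte handshake described above could be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in attachment 3715: struct child_slot, child_notify() and listener_handle_pipe() are hypothetical names, and a real listener would multiplex many pipes with poll() and handle EINTR.

```c
#include <unistd.h>

/* Hypothetical per-child tracking state kept by the listener. */
enum child_state {
    CHILD_STARTING,         /* forked, rexec state not yet received */
    CHILD_AUTHENTICATING,   /* first byte seen, counts against the limit */
    CHILD_DONE              /* authenticated or gone, slot is free */
};

struct child_slot {
    int pipe_fd;            /* listener's read end of the startup pipe */
    enum child_state state;
};

/* Child side: send a single notification byte up the startup pipe. */
int
child_notify(int fd)
{
    char ch = 0;

    return write(fd, &ch, 1) == 1 ? 0 : -1;
}

/*
 * Listener side: consume one byte and advance the child's state.
 * Returns 0 if the child is still being tracked, 1 if it reported a
 * successful authentication (slot freed), or -1 on EOF or error
 * (the child closed the pipe or exited, slot freed).
 */
int
listener_handle_pipe(struct child_slot *c)
{
    char ch;

    if (read(c->pipe_fd, &ch, 1) != 1) {
        c->state = CHILD_DONE;
        return -1;
    }
    if (c->state == CHILD_STARTING) {
        c->state = CHILD_AUTHENTICATING;
        return 0;
    }
    c->state = CHILD_DONE;
    return 1;
}
```

In this model only slots in the CHILD_STARTING or CHILD_AUTHENTICATING state would be counted when enforcing the startup limit, so authenticated sessions no longer consume it.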

--- Comment #11 from Darren Tucker <dtucker@dtucker.net> ---
Comment on attachment 3715
--> https://bugzilla.mindrot.org/attachment.cgi?id=3715
Really fixed diff

> From 9f895491cc6a671fc49b9cda78edfe3801b0af74 Mon Sep 17 00:00:00 2001
> From: Damien Miller <djm@mindrot.org>
> Date: Fri, 4 Aug 2023 14:51:03 +1000
> Subject: [PATCH] logging of monitor process exits in listener

This seems like a bit too large of a change to go in so close to a
release?

--- Comment #12 from Damien Miller <djm@mindrot.org> ---
> This seems like a bit too large of a change to go in so close to a release?

Oh sure, I'm not proposing this for 9.4, but for afterwards.
