Mailing List Archive

[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

likan_999.student@sina.com changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|ssh startup time is linear |ssh startup time on OSX is
|to _SC_OPEN_MAX |linear to _SC_OPEN_MAX

--
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Darren Tucker <dtucker@dtucker.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |dtucker@dtucker.net
Attachment #3305|application/octet-stream |text/plain
mime type| |
Attachment #3305|0 |1
is patch| |

--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Damien Miller <djm@mindrot.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Blocks| |2988
CC| |djm@mindrot.org


Referenced Bugs:

https://bugzilla.mindrot.org/show_bug.cgi?id=2988
[Bug 2988] Tracking bug for 8.1 release
--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Damien Miller <djm@mindrot.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #3305|0 |1
is obsolete| |
Status|NEW |ASSIGNED
Assignee|unassigned-bugs@mindrot.org |djm@mindrot.org
Attachment #3306| |ok?(dtucker@dtucker.net)
Flags| |

--- Comment #1 from Damien Miller <djm@mindrot.org> ---
Created attachment 3306
--> https://bugzilla.mindrot.org/attachment.cgi?id=3306&action=edit
refactor, add fallback

Thanks for the patch.

I think there were two problems in your original implementation:

1) if the fd table size changed between the size-query proc_pidinfo()
and the call to obtain the fd map then the code would just bail out
without closing anything

2) likewise, for other errors (e.g. malloc failure) - it's better to
fall back to brute-force closing all fds than doing nothing.

I've refactored the closefrom implementation to do both these, with a
closefrom_fallback() doing the brute-force closure, and sharing this
with the /proc/pid/fd strategy. I'm not in from of an OSX machine right
now, so please test.

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

--- Comment #2 from Darren Tucker <dtucker@dtucker.net> ---
Comment on attachment 3306
--> https://bugzilla.mindrot.org/attachment.cgi?id=3306
refactor, add fallback

>+ AC_CHECK_HEADERS([libproc.h], [
>+ AC_CHECK_DECLS([proc_pidinfo], [], [], [#include <libproc.h>])

this only checks that the function is declared, not that it's actually
present. conceivably this could activate this code but fail at link
time.

otherwise looks ok.

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

--- Comment #3 from likan_999.student@sina.com ---
Thanks for reviewing it. Good catch, the error case should fallback to
original implementation. For detecting table size change, I was
previously thinking the function will be called in single thread
situation (such as in ssh, at beginning) thus the fd table is not
possible to change.

If we want to make it robust enough to handle the case where another
thread opens/closes the files between the two calls to proc_fdinfo (do
we have such case for openssh codebase?), it is incorrect the way the
refactored patch detects the table change. Here is the source code of
proc_pidinfo:
https://opensource.apple.com/source/xnu/xnu-2050.7.9/bsd/kern/proc_info.c

From the code (function proc_pidfdlist), we can see if bufsize is not
large enough to hold all file descriptors, it only returns as many as
possible to be held in the buffer, without reporting any error, thus
won't be detected by comparing return value with NULL (actually, it is
not a good style to compare buffer size with NULL).

If we really want to detect the change of file descriptors, we need to
compare with the previous buffer size, if it is the same as previous
value, then it is possible new ones are added, thus we need to retry.
But IMHO, is this really needed? If yes, we also need to retry in the
fallback method.

--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

--- Comment #4 from likan_999.student@sina.com ---
Comment on attachment 3306
--> https://bugzilla.mindrot.org/attachment.cgi?id=3306
refactor, add fallback

Also, the line 129, after closefrom_fallback is called, it should
return, otherwise fdinfo_buf==NULL but dereferenced at line 132, which
will crash the process.

After line 123, I guess you miss a 'continue'? Otherwise it won't ever
retry, right?

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Damien Miller <djm@mindrot.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #3306|ok?(dtucker@dtucker.net) |
Flags| |
Attachment #3306|0 |1
is obsolete| |

--- Comment #5 from Damien Miller <djm@mindrot.org> ---
Created attachment 3307
--> https://bugzilla.mindrot.org/attachment.cgi?id=3307&action=edit
attempt #2

revised patch

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

--- Comment #6 from likan_999.student@sina.com ---
Comment on attachment 3307
--> https://bugzilla.mindrot.org/attachment.cgi?id=3307
attempt #2

Online 129, you miss a semicolon after return, which makes it fail to
compile.

Also, we need to have "got < need" instead of "got <= need". Firstly,
got == need means the buffer given to proc_pidinfo is full, so it is
possible there are more file descriptors added, thus we need to retry
in this case. Secondly, proc_pidinfo, if given NULL buffer, will return
20 more entries than the actual file table size, so in normal cases,
the 'got < need' code path will be executed.

In case got == need, ideally we should close file descriptors in the
buffer returned, before we retry. In this way, the next retry will have
far less number of elements to work on. Otherwise, if another thread
keep adding new file descriptors, the next retry will have more file
descriptors than previous retry, causing the proc_pidinfo syscall as
well as malloc to be even slower, making it even lower chance that
there is no file table changed between two proc_pidinfo syscalls.

Again, I'd like to ask, in which case this closefrom is called while
another thread is opening new file descriptors? The retry logic is
complicated now, do we really need to introduce such a complicated
logic to handle such a corner case? My understanding is, neither
F_CLOSEM nor fallback implementation guarantees anything if there is
another thread keep opening new file descriptors, therefore this
closefrom is just on best effort basis.

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

--- Comment #7 from Darren Tucker <dtucker@dtucker.net> ---
I don't think the retries are worth it in this context either. I'm not
sure it's even possible for an FD to change under it, but if so this
could be handled by the fallback.

Also noticed this warning this because "got" gets promoted:

bsd-closefrom.c:132:16: warning: comparison of integers of different
signs:
'int' and 'unsigned long' [-Wsign-compare]
for (i = 0; i < got / PROC_PIDLISTFD_SIZE; i++) {
~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~
$ grep -r PROC_PIDLISTFD_SIZE /usr/include
/usr/include//sys/proc_info.h:#define PROC_PIDLISTFD_SIZE
(sizeof(struct proc_fdinfo))

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Damien Miller <djm@mindrot.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #3307|0 |1
is obsolete| |
Attachment #3309| |ok?(dtucker@dtucker.net)
Flags| |

--- Comment #8 from Damien Miller <djm@mindrot.org> ---
Created attachment 3309
--> https://bugzilla.mindrot.org/attachment.cgi?id=3309&action=edit
revised again

ok, I've got my mac running again. Here's a revision that removes the
retry loop and just goes straight to the fallback when anything goes
wrong. It seems to close the right stuff when inspected with dtruss.

--
You are receiving this mail because:
You are watching the assignee of the bug.
You are watching someone on the CC list of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs
[Bug 3049] ssh startup time on OSX is linear to _SC_OPEN_MAX [ In reply to ]
https://bugzilla.mindrot.org/show_bug.cgi?id=3049

Darren Tucker <dtucker@dtucker.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #3309|ok?(dtucker@dtucker.net) |ok+
Flags| |

--
You are receiving this mail because:
You are watching someone on the CC list of the bug.
You are watching the assignee of the bug.
_______________________________________________
openssh-bugs mailing list
openssh-bugs@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-bugs