Mailing List Archive

patch: minimize calls to close() in spawn by calling poll()
Hello,

Posting this patch to the mailing list per jeroen's instructions.

# Problem Statement
When running GPGME within Docker, the maximum number of FDs may be set
higher than a sane and reasonable values for its specific use-case, causing
many close calls to be issued, introducing a major slowdown in performance
of any calls to the `gpg` binary.

# Proposed Solution
This PR attempts to mitigate the problem by calling `poll` on the entire
range of file descriptors before issuing `close` calls to each file
descriptor. This limits the number of close calls made to the number of
active file descriptors in the current process. It also includes a minor
FIXME.

## Performance before and after

This graph shows the performance of an application signing and then
decrypting a few kilobytes of data.
https://user-images.githubusercontent.com/772810/48650469-0ce7d700-e9ab-11e8-84fd-5e3f6b619220.png

# The patch

https://patch-diff.githubusercontent.com/raw/gpg/gpgme/pull/1.patch
Re: patch: minimize calls to close() in spawn by calling poll() [ In reply to ]
On Sat, 17 Nov 2018 01:57, dctucker@github.com said:

> When running GPGME within Docker, the maximum number of FDs may be set
> higher than a sane and reasonable values for its specific use-case, causing
> many close calls to be issued, introducing a major slowdown in performance

Have you investigated why get_max_fd does not work for you. On Linux
that function is is used to detect the highest used file descriptor by
reading /proc/self/fd/. Or is your application really using that many
fds?

Is is guaranteed that poll(3) does not allocate memory or other things
which may take a lock?

Minor nitpicks: The code assumes that poll is available. It should also
not used TRACE because that may allocate memory and thus can lead to a
deadlock.


Shalom-Salam,

Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Re: patch: minimize calls to close() in spawn by calling poll() [ In reply to ]
* Werner Koch <wk@gnupg.org>, 2018-11-18, 12:42:
>On Sat, 17 Nov 2018 01:57, dctucker@github.com said:
>>When running GPGME within Docker, the maximum number of FDs may be set
>>higher than a sane and reasonable values for its specific use-case,
>>causing many close calls to be issued, introducing a major slowdown in
>>performance
>
>Have you investigated why get_max_fd does not work for you. On Linux
>that function is is used to detect the highest used file descriptor by
>reading /proc/self/fd/. Or is your application really using that many
>fds?

FWIW, the parent of Casey's commit is fff2049c1bc7 (from 2014...), which
didn't have this /proc/self/fd thing. Huh?

>Is is guaranteed that poll(3) does not allocate memory or other things
>which may take a lock?

Yes, poll() is async-singal-safe.

--
Jakub Wilk

_______________________________________________
Gnupg-devel mailing list
Gnupg-devel@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-devel
Re: patch: minimize calls to close() in spawn by calling poll() [ In reply to ]
> FWIW, the parent of Casey's commit is fff2049c1bc7 (from 2014...), which
> didn't have this /proc/self/fd thing. Huh?

Thank you for pointing this out. Since the default branch on the
github mirror is set to `bjk/master`, I erroneously assumed it was the
latest and greatest. I'll target `master` in the next patch.

> Have you investigated why get_max_fd does not work for you. On Linux
> that function is is used to detect the highest used file descriptor by
> reading /proc/self/fd/. Or is your application really using that many
> fds?

In our application, we are using an older version of GPGME, version
1.9 by way of ueno/ruby-gpgme. I will attempt to backport the newer
`get_max_fds` code to improve performance. We did verify that the
application is not generating a high number of file descriptors by
running perf against GPGME tests, ruby-gpgme tests, and our
applicaton's tests, and seeing a similarly high number of close calls
running.

> Minor nitpicks: The code assumes that poll is available. It should also
> not used TRACE because that may allocate memory and thus can lead to a
> deadlock.

poll() is available since Linux 2.1.23; what is the preferred macro to
determine whether to use poll(); should we include linux/version.h if
available and check `LINUX_VERSION_CODE >= KERNEL_VERSION(2,1,23)`?
I can remove the TRACE.

_______________________________________________
Gnupg-devel mailing list
Gnupg-devel@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-devel
Re: patch: minimize calls to close() in spawn by calling poll() [ In reply to ]
On Sun, 18 Nov 2018 20:49, dctucker@github.com said:

> Thank you for pointing this out. Since the default branch on the
> github mirror is set to `bjk/master`, I erroneously assumed it was the
> latest and greatest. I'll target `master` in the next patch.

Please do no use an arbitrary mirror but use git.gnupg.org or our own
mirror at dev.gnupg.org.

> poll() is available since Linux 2.1.23; what is the preferred macro to

GPGME is not Linux specific but needs to be build on all POSIX platforms
as well as Windows Vista and newer.


Shalom-Salam,

Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.