EPOLLEXCLUSIVE and selectors
Hello, I'd like to bring up https://bugs.python.org/issue44951 /
https://github.com/python/cpython/pull/27819 to the mailing list's
consideration as it has idled for a bit. I would appreciate some
authoritative feedback on which API design choice is best. I'll also
recap the PR quickly:

Motivation:
- There is community demand for EPOLLEXCLUSIVE in Python (see blog
posts in BPO issue)
- You can't do this with the existing stdlib code as
_BaseSelectorImpl.register() raises ValueError on non-READ/WRITE
constants. (This is why _PollLikeSelector.register() has two
variables, events and poller_events)
- It's not an invasive change. The Python API changes little, if at
all, and the kernel's EPOLLEXCLUSIVE behavior was carefully designed
to be backwards compatible with using exclusive and non-exclusive
watches on the same file descriptor.
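
To make the second bullet concrete, here is a minimal sketch of what happens today when an extra epoll flag is passed through the selectors API. The helper name register_with_extra_flag is made up for illustration, and the fallback value 1 << 28 is an assumption taken from the Linux headers for builds where select.EPOLLEXCLUSIVE is missing.

```python
import select
import selectors
import socket

# select.EPOLLEXCLUSIVE only exists on Linux builds; the fallback value
# (1 << 28) is an assumption copied from the kernel headers.
EPOLLEXCLUSIVE = getattr(select, "EPOLLEXCLUSIVE", 1 << 28)


def register_with_extra_flag():
    """Try to register a socket with EVENT_READ | EPOLLEXCLUSIVE.

    Returns the ValueError message, or None if registration succeeded.
    """
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    try:
        sel.register(a, selectors.EVENT_READ | EPOLLEXCLUSIVE)
    except ValueError as exc:
        # The selectors module only accepts EVENT_READ / EVENT_WRITE bits.
        return str(exc)
    finally:
        sel.close()
        a.close()
        b.close()
    return None


print(register_with_extra_flag())
```

The rejection happens in the shared registration code, before any epoll-specific logic runs, so the same ValueError appears with every selector class.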

I've got two approaches to this. The first extends the EpollSelector
class with a property to toggle setting the EPOLLEXCLUSIVE on newly
registered file descriptors:

https://github.com/dgilman/cpython/commit/43174df5bd7a78eedf0692ebbe63a9b943248a74

The second introduces an entirely new sibling class,
EpollExclusiveSelector, that unconditionally sets it on registration:

https://github.com/dgilman/cpython/commit/554a5bf9c16b6bd82ce47b2d0dd1833f2bdd31cb

The first one was my first attempt but I am leaning towards the
second. It doesn't require any new API surface area. It also gets
integrated into the DefaultSelector logic, and even if that shouldn't
happen it's still easy to swap out your existing selector class for
the EpollExclusiveSelector class.

Enabling EPOLLEXCLUSIVE by default:

From the research I did last year my understanding is that
EPOLLEXCLUSIVE is never a performance drawback on Linux, and in the
case of a server that gets fast traffic it's a dramatic improvement.
However, I have not done my own benchmarking (with, say, gunicorn,
which uses the stdlib's selectors module).

Note that EPOLLEXCLUSIVE does have one kernel API break: you can no
longer use EPOLL_CTL_MOD on an exclusive file descriptor. Python uses
the _MOD flag under the hood to implement epoll.modify(), which
results in EpollSelector.modify() throwing an OSError if you try to
modify an exclusive file descriptor.
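
The EPOLL_CTL_MOD restriction can be reproduced with the raw select.epoll object, outside of selectors entirely. This is only a sketch: the function name modify_exclusive_fd and the "unsupported" return value are invented for illustration, and the 1 << 28 fallback is an assumption from the Linux headers. On a kernel that implements EPOLLEXCLUSIVE I would expect the errno to be EINVAL, as documented in epoll_ctl(2).

```python
import errno
import select
import socket

# Fallback value is an assumption taken from the Linux headers.
EPOLLEXCLUSIVE = getattr(select, "EPOLLEXCLUSIVE", 1 << 28)


def modify_exclusive_fd():
    """Register a socket exclusively, then try to modify it.

    Returns the errno raised by epoll.modify(), None if modify() succeeded,
    or the string "unsupported" if epoll or the flag is unavailable.
    """
    if not hasattr(select, "epoll"):
        return "unsupported"  # not a Linux build
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        try:
            ep.register(a.fileno(), select.EPOLLIN | EPOLLEXCLUSIVE)
        except OSError:
            return "unsupported"  # the kernel rejected the flag
        try:
            # epoll.modify() issues EPOLL_CTL_MOD under the hood.
            ep.modify(a.fileno(), select.EPOLLIN | select.EPOLLOUT)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        ep.close()
        a.close()
        b.close()


result = modify_exclusive_fd()
print(errno.errorcode.get(result, result))
```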

In the second PR I implemented an EpollExclusiveSelector.modify() which
unregisters and reregisters the file descriptor to get around the _MOD
behavior. This means no surprise crash when someone updates Python.
But there may be other subtle regressions here: the performance of
modify() is likely going to regress, and someone could have a
dependency on Python actually using _MOD.
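
The unregister-and-reregister idea can be sketched against the public selectors API. This is not the code from the PR; modify_by_reregistering is a hypothetical helper that works with any selector class and only illustrates the shape of the workaround.

```python
import selectors
import socket


def modify_by_reregistering(sel, fileobj, events, data=None):
    """Change a registration without relying on EPOLL_CTL_MOD.

    Hypothetical helper: drops the existing registration and adds a new
    one, so the kernel sees a delete followed by an add.
    """
    key = sel.get_key(fileobj)  # raises KeyError if fileobj is not registered
    if events == key.events and data == key.data:
        return key  # nothing to change
    sel.unregister(fileobj)
    return sel.register(fileobj, events, data)


sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ, data="old")
new_key = modify_by_reregistering(sel, a, selectors.EVENT_WRITE, data="new")
print(new_key.events == selectors.EVENT_WRITE, new_key.data)
sel.close()
a.close()
b.close()
```

One trade-off of this shape is that it costs two system calls instead of one, and the file descriptor is briefly unwatched between them. With level-triggered epoll, readiness that arrives in that gap should still be reported once the descriptor is registered again.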

But as a rebuttal to those, I have a feeling that nobody really
depends on the performance of modify(), and even in the case where
someone does use it a lot it's likely for the data= path which is
completely unchanged here. I also am struggling to think of a place
where someone would care about the kernel-level changes between _MOD
and _ADD/_DEL but that might be my own lack of imagination or
knowledge of epoll techniques.

Maybe a compromise is to ship EpollExclusiveSelector for a release
without it being the default and bump it to the default after seeing
if anyone's turned up any incompatibilities.

--
David Gilman
:DG<
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VB2BBUUJENMWPGXGFMON7UQW27ZOBIB7/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: EPOLLEXCLUSIVE and selectors
> On 3 Jul 2022, at 00:42, David Gilman <davidgilman1@gmail.com> wrote:
>
> Maybe a compromise is to ship EpollExclusiveSelector for a release
> without it being the default and bump it to the default after seeing
> if anyone's turned up any incompatibilities.

This would benefit frameworks like Twisted. I have seen the thundering herd caused by this problem in my work.

Barry

_______________________________________________
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RLVYNSTSSH465JPWAVHEXJACDEE2W4G5/