Issue 1364 in cherokee: Reverse proxy mishandling of errors cause writes to random fd ... with very bad consequences
Status: New
Owner: ----

New issue 1364 by 246...@gmail.com: Reverse proxy mishandling of errors
cause writes to random fd ... with very bad consequences
http://code.google.com/p/cherokee/issues/detail?id=1364

To reproduce the issue:

1. Configure cherokee with a reverse proxy handler pointing to a backend,
with a 5 second timeout.
2. In that backend, sleep for 10 seconds before responding.
3. Send two POST requests, one after the other.
4. In the error log you'll find:

-----
[11/06/2012 14:26:49.791] (error) socket.c:708 - Could not write to socket:
write(18, ..): 'Bad file descriptor' | It looks like you've hit a bug
in the server.
Please, do not hesitate to report it at
http://bugs.cherokee-project.com/
so the developer team can fix it.
-----
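
For step 2, any backend that answers more slowly than the proxy timeout will do. Below is a minimal sketch of such a backend in C. It is not the backend used in this report: slow_backend_handle, slow_backend_serve_one and the canned reply are hypothetical names chosen for illustration, and the single read() assumes the request fits in one buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Canned reply; the body "ok\n" is 3 bytes long. */
static const char SLOW_REPLY[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 3\r\n"
    "Connection: close\r\n"
    "\r\n"
    "ok\n";

/* Read one request, stall for delay_s seconds, then answer.
 * Returns 0 on success, -1 on I/O error. */
int slow_backend_handle(int client_fd, unsigned delay_s)
{
    char   req[4096];
    size_t len = sizeof(SLOW_REPLY) - 1;
    size_t off = 0;

    assert(client_fd >= 0);

    /* Assumption: one read() is enough to receive the request. */
    if (read(client_fd, req, sizeof(req)) < 0)
        return -1;

    /* The stall that pushes the proxy past its 5 second timeout. */
    sleep(delay_s);

    while (off < len) {
        ssize_t n = write(client_fd, SLOW_REPLY + off, len - off);
        if (n <= 0)
            return -1;
        off += (size_t)n;
    }
    return 0;
}

/* Accept a single client on an already listening socket and serve it. */
int slow_backend_serve_one(int listen_fd, unsigned delay_s)
{
    int fd = accept(listen_fd, NULL, NULL);
    int re;

    if (fd < 0)
        return -1;
    re = slow_backend_handle(fd, delay_s);
    close(fd);
    return re;
}
```

Calling slow_backend_serve_one(listen_fd, 10) in a loop on a listening socket gives a backend that exceeds the 5 second timeout on every request.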


If you enable trace mode, you'll see this when the first 'Gateway timeout'
is received:

----
{0x7F34C34DB700} [11/06/2012 13:31:36.423] thread.c:0562 (process_polling_connections): thread (0x63b4f0) processing polling conn (0x6bea70, Init connection): Time out
{0x7F34C34DB700} [11/06/2012 13:31:36.423] socket.c:1052 (cherokee_socket_bufwrite): write fd=17 len=187 ret=0 written=187
{0x7F34C34DB700} [11/06/2012 13:31:36.423] util.c:1665 (cherokee_fd_close): fd=17 re=0
{0x7F34C34DB700} [11/06/2012 13:31:36.423] socket.c:0210 (cherokee_socket_close): fd=17 is_tls=0 re=0
{0x7F34C34DB700} [11/06/2012 13:31:36.425] util.c:1665 (cherokee_fd_close): fd=18 re=0
{0x7F34C34DB700} [11/06/2012 13:31:36.425] connection.c:0395 (cherokee_connection_clean): conn 0x6bea70, has headers 0
----

As you can see, fd=18 (the fd to the backend) is closed, but with
cherokee_fd_close rather than cherokee_socket_close.

So when the second POST arrives, you get:

----
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:0896 (cherokee_handler_proxy_init): Entering init 'get conn'
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:0934 (cherokee_handler_proxy_init): Entering phase 'preconnect'
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:0979 (cherokee_handler_proxy_init): Entering phase 'connect': pconn=0x6d4110
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:1044 (cherokee_handler_proxy_init): Entering phase 'build headers'
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:0319 (add_request): Client request: '/wait/10/10'
{0x7F34BE5FC700} [11/06/2012 13:31:36.430] handler_proxy.c:1059 (cherokee_handler_proxy_init): Entering phase 'send headers'
{0x7F34BE5FC700} [11/06/2012 13:31:36.431] socket.c:1052 (cherokee_socket_bufwrite): write fd=18 len=415 ret=-1 written=0
----

It tries to reuse fd=18 because the 'cherokee_socket' object was never
closed, although the underlying fd was. If by any chance the number 18 had
been reassigned to another descriptor in the meantime, the 'send headers'
phase would inject data into an unrelated connection.
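
The reuse hazard can be shown outside of Cherokee. The following is a minimal, single-threaded sketch, not Cherokee code: fake_socket_t and the function name are invented for illustration. It relies only on the POSIX rule that a newly created descriptor receives the lowest free number.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a socket object that caches its fd number. */
typedef struct {
    int fd;
} fake_socket_t;

/* Returns 1 if a write through the stale object landed in an unrelated
 * descriptor, 0 if it did not, -1 if the pipes could not be created. */
int stale_fd_write_lands_elsewhere(void)
{
    int  victim[2];
    int  backend[2];
    char buf[8] = {0};
    fake_socket_t sock;
    int  reused;
    int  hit = 0;

    if (pipe(victim) != 0)
        return -1;
    if (pipe(backend) != 0) {
        close(victim[0]);
        close(victim[1]);
        return -1;
    }

    sock.fd = backend[1];      /* the object remembers the backend fd */
    assert(sock.fd >= 0);

    close(backend[1]);         /* raw close: sock.fd is NOT reset */

    /* The lowest free number is the one just closed, so the duplicate
     * of the victim's write end takes it over. */
    reused = dup(victim[1]);

    /* Writing "to the backend" now feeds the victim pipe instead. */
    if (reused == sock.fd &&
        write(sock.fd, "oops", 4) == 4 &&
        read(victim[0], buf, sizeof(buf) - 1) == 4 &&
        strcmp(buf, "oops") == 0)
        hit = 1;

    if (reused >= 0)
        close(reused);
    close(backend[0]);
    close(victim[0]);
    close(victim[1]);
    return hit;
}
```

In a busy server the number would be taken by whatever accept() or connect() happens next, which is why the outcome looks random.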


I traced the error to the 'got_all' flag in handler_proxy, which is set to
true when all POST data has been received. That doesn't make sense:
got_all indicates whether the backend is ready to be used. With this flag
set, when the connection is freed because of the timeout, the backend
connection is placed back into the pool to be reused, which is obviously
wrong.

I applied the attached patch (which no longer sets got_all to true at that
point) and this fixes the issue: the connection is then closed (the
keepalive flag is set to false in cherokee_handler_proxy_free, causing the
cherokee_handler_proxy_conn_release call to close the connection instead of
reusing it).
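
The release decision can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the actual Cherokee source: pconn_t, handler_free and pconn_release are hypothetical names that only mirror the roles of the got_all flag, cherokee_handler_proxy_free and cherokee_handler_proxy_conn_release described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical model of a pooled backend connection. */
typedef struct {
    int  fd;         /* backend descriptor, -1 once closed */
    bool keepalive;  /* may the connection go back to the pool? */
} pconn_t;

typedef enum { PCONN_REUSED, PCONN_CLOSED } pconn_fate_t;

/* Put the connection back in the pool only when keepalive is still set;
 * otherwise close it and forget the descriptor number. */
pconn_fate_t pconn_release(pconn_t *pc)
{
    assert(pc != NULL);

    if (pc->keepalive && pc->fd >= 0)
        return PCONN_REUSED;

    if (pc->fd >= 0) {
        close(pc->fd);
        pc->fd = -1;
    }
    return PCONN_CLOSED;
}

/* Handler teardown: if the exchange with the backend did not complete
 * (got_all is false), the connection must not be reused. */
pconn_fate_t handler_free(pconn_t *pc, bool got_all)
{
    assert(pc != NULL);

    if (!got_all)
        pc->keepalive = false;
    return pconn_release(pc);
}
```

With got_all wrongly set to true after the POST body is received, a timed-out request takes the PCONN_REUSED branch; with the flag left false it takes PCONN_CLOSED.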

HOWEVER, the fd is still referenced as the 'polling' fd of the
connection, so close() will still be called on it by
cherokee_connection_{free,clean} as part of the connection cleanup.

So somewhere we must also remove the proxy connection from the 'polling',
but I'm not sure what the correct way to do this is.
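
One possible shape for this, offered only as a sketch and not as the way Cherokee should or does do it: keep a single owner for the descriptor, and have the borrower drop its reference before the owner closes. All names below (sock_t, conn_t, conn_stop_polling and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

typedef struct { int fd; } sock_t;          /* owns the descriptor */
typedef struct { int polling_fd; } conn_t;  /* only borrows the number */

/* Close at most once, then forget the number so that any later call
 * is a harmless no-op. */
int sock_close(sock_t *s)
{
    int re = 0;

    assert(s != NULL);
    if (s->fd >= 0) {
        re = close(s->fd);
        s->fd = -1;
    }
    return re;
}

/* Drop the borrowed reference without closing anything. */
void conn_stop_polling(conn_t *c, int fd)
{
    assert(c != NULL);
    if (c->polling_fd == fd)
        c->polling_fd = -1;
}

/* Connection cleanup: closes only a descriptor it still references.
 * Returns the number of close() calls it made. */
int conn_clean(conn_t *c)
{
    int closed = 0;

    assert(c != NULL);
    if (c->polling_fd >= 0) {
        close(c->polling_fd);
        c->polling_fd = -1;
        closed = 1;
    }
    return closed;
}

/* Releasing the backend: unhook it from polling first, then close. */
int release_backend(conn_t *c, sock_t *s)
{
    assert(c != NULL && s != NULL);
    conn_stop_polling(c, s->fd);
    return sock_close(s);
}
```

After release_backend, neither conn_clean nor a second sock_close touches the old number, so a descriptor that has since been reassigned is left alone.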

Attachments:
0001-handler_proxy-Don-t-set-the-got-all-flag-when-POST-d.patch 1.3 KB

_______________________________________________
Cherokee-dev mailing list
Cherokee-dev@lists.octality.com
http://lists.octality.com/listinfo/cherokee-dev

Re: Issue 1364 in cherokee: Reverse proxy mishandling of errors cause writes to random fd ... with very bad consequences
Updates:
Status: Started
Owner: ste...@konink.de
Labels: Type-Defect Priority-High Component-Logic Usability
Milestone-Stable

Comment #1 on issue 1364 by ste...@konink.de: Reverse proxy mishandling of
errors cause writes to random fd ... with very bad consequences
http://code.google.com/p/cherokee/issues/detail?id=1364

(No comment was entered for this change.)

Re: Issue 1364 in cherokee: Reverse proxy mishandling of errors cause writes to random fd ... with very bad consequences
Comment #2 on issue 1364 by 246...@gmail.com: Reverse proxy mishandling of
errors cause writes to random fd ... with very bad consequences
http://code.google.com/p/cherokee/issues/detail?id=1364

Any news on this? The complete explanation and a patch are provided, so it
should be fairly straightforward ...

Re: Issue 1364 in cherokee: Reverse proxy mishandling of errors cause writes to random fd ... with very bad consequences
Updates:
Status: Fixed

Comment #3 on issue 1364 by ste...@konink.de: Reverse proxy mishandling of
errors cause writes to random fd ... with very bad consequences
http://code.google.com/p/cherokee/issues/detail?id=1364

We also need a QA test to make sure this does not happen anymore. I have
just committed this patch! Thank you very much.
