Mailing List Archive

Problem that started with pathinfo
Hi,


I started to deploy another installation of my other work on a machine
and created a directory match. Under this match a different docroot was
specified. Before this rule I already had the common extension php rule.
I hoped cherokee would serve php from this other directory as well.

Extension php FastCGI [non-final]

Directory /kka List & Send [final]
+ Doc.Root

Default List & Send [final]


This is a very common use case. The first thing I noticed when doing a
PHP request: DOCUMENT_ROOT was not set to the docroot specified for
/kka. Looking further I noticed something odd. When I did a request for
a PHP file in /kka, the filename in PATH_TRANSLATED was actually
truncated. Instead of ending with /myotherdir/index.php it ended with
/myotherdirex.php. After fiddling in the code I noticed that the number
of missing chars equals the length of the web_directory (here: /kka).

** It should be noted that the actual stripping of the web_directory
from the request happens in cherokee_connection_set_custom_droot **
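
To illustrate how such a truncation can arise, here is a minimal sketch
in Python (not Cherokee's C code). It assumes the web directory is
stripped once from the request and then a second, unchecked strip of the
same length is applied later. All function names are hypothetical.

```python
def strip_web_directory(request: str, web_directory: str) -> str:
    """Remove the matched web directory prefix from the request, once."""
    if web_directory and request.startswith(web_directory):
        return request[len(web_directory):]
    return request


def buggy_local_path(docroot: str, stripped_request: str, web_directory: str) -> str:
    # Assumed bug: drops len(web_directory) characters a second time,
    # without checking that the prefix is still there.
    return docroot + stripped_request[len(web_directory):]


def fixed_local_path(docroot: str, stripped_request: str) -> str:
    # The request was already stripped, so it is appended unchanged.
    return docroot + stripped_request


request = strip_web_directory("/kka/index.php", "/kka")
print(buggy_local_path("/myotherdir", request, "/kka"))
print(fixed_local_path("/myotherdir", request))
```

The buggy variant loses exactly four characters ("/ind"), which matches
the /myotherdirex.php seen in PATH_TRANSLATED.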

The attached patch (cherokee-pathinfo.diff), now in its 4th iteration,
addresses this. But after grepping through the code I expected a similar
pattern to be found in rule_exist.c and rule_extensions.c. I tried to
trigger the issue with rule_exist, but it seems that web_directory is
never set there. I went reading through rule_extensions.c:

/* A previous non-final rule set a custom document root */


I did:

Directory /kka List & Send [non-final]
+ Doc.Root

File exists HTTP Error [non-final]

Default HTTP Error [final]


(I changed List & Send at the end to a different HTTP Error, because
otherwise I couldn't see which handler was executed.)


The result is:
conn->request -> "/kka/index.php"
conn->web_directory -> "/kka"

tmp -> "/myotherdir//index.php"

Notice the double slash. The system *does* match it, yet the HTTP error
doesn't occur: the system chooses to go for handler_file instead of the
error provided by the later rule.

To summarize: while we are able to set a new document root in this way,
we cannot override the handler of the rule that did so.


Let's look at the slashes problem a bit deeper. One thing is for sure:
the document_root in the configuration file has no slash as suffix. Then
where was it introduced?

Easy: cherokee_buffer_add_str (tmp, "/");

Is there a case where a [malformed] request doesn't have a slash to
begin with? No, that case is actually already checked in
cherokee_connection_get_request: if a request comes in without a slash,
it is a bad request and handled as such.

For the two cases I could find I have addressed this in patch
cherokee-extra-slash.diff.
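
As an illustration of where the double slash comes from, here is a small
Python sketch. It is not the content of cherokee-extra-slash.diff; both
function names are hypothetical and the normalising join is just one
possible way to avoid the extra slash.

```python
def naive_join(docroot: str, request: str) -> str:
    # Mirrors the pattern above: always add "/" between the two parts,
    # even though a valid request already starts with a slash.
    return docroot + "/" + request


def join_docroot(docroot: str, request: str) -> str:
    """Join docroot and request with exactly one slash between them."""
    return docroot.rstrip("/") + "/" + request.lstrip("/")


print(naive_join("/myotherdir", "/index.php"))
print(join_docroot("/myotherdir", "/index.php"))
```

Since every accepted request starts with a slash, simply not adding the
separator would have the same effect for a docroot without a trailing
slash.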


Ok, back to the overloading of the handler or in Cherokee trace terms:
"merging rule". My List & Send rule is 800, and my HTTP Error rule is
700. This is taken care of in cherokee_config_entry_complete.

How I interpret this code: a property-group of a match (handler,
validator, access, documentroot, authentication, users, encoders,
headers, frontline cache) is only filled in by a rule if it has not
already been set by an earlier one. So it is not possible to have two
non-final rules changing the handler, while it is possible to have a
second non-final rule setting the documentroot.
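
My reading of this merging behaviour can be sketched as follows. This is
a simplified Python model, not the C code of
cherokee_config_entry_complete: only two properties are modelled, the
Entry class and complete function are hypothetical, and it assumes rules
are merged in order of descending priority number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    handler: Optional[str] = None
    document_root: Optional[str] = None


def complete(entry: Entry, rule: Entry) -> None:
    """Copy each property from rule only if entry does not have it yet."""
    if entry.handler is None:
        entry.handler = rule.handler
    if entry.document_root is None:
        entry.document_root = rule.document_root


# Assumed evaluation order: rule 800 first, then rule 700.
rules = [
    Entry(handler="List & Send", document_root="/myotherdir"),  # Directory /kka
    Entry(handler="HTTP Error"),                                 # File exists
]

merged = Entry()
for rule in rules:
    complete(merged, rule)

print(merged.handler, merged.document_root)
```

In this model the handler of the first matching rule sticks, which is
consistent with handler_file being chosen over the HTTP Error.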


Reading this code, the proper example configuration should be:


Extension php FastCGI [non-final]

Directory /kka List & Send [non-final]
+ Doc.Root

Default List & Send [final]


One problem remains: the document root in the CGI base class is only
updated to an alternate document root when advanced virtual hosting is
enabled. The fix seems trivial: local_directory should be our general
document root.

But still, php-fpm gives me "File not found." I'll share the
environment variables that matter with you.

DOCUMENT_ROOT = "/myotherdir"
PATH_INFO = "/index.php"
REQUEST_URI = "/kka/index.php"
SCRIPT_URL = "/index.php"
SCRIPT_NAME = "/kka"
PATH_TRANSLATED = "/myotherdir/index.php"
SCRIPT_FILENAME = "/index.php"

At this point my only hunch was that SCRIPT_NAME doesn't look legit. I
went to the code and found that the generation of SCRIPT_NAME depends on
whether check_file is enabled. When I enabled check_file again I got:

SCRIPT_NAME = "/kka/index.php"
SCRIPT_FILENAME = "/myotherdir/index.php"
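
The difference between the two sets of variables can be captured in a
small sanity check. This is a Python sketch of my own heuristic, not a
rule taken from php-fpm or Cherokee; the function name is hypothetical.
It only assumes that the script file should live under DOCUMENT_ROOT and
that SCRIPT_NAME should end with the script's file name.

```python
def suspicious_cgi_vars(env: dict) -> list:
    """Return a list of human-readable problems found in the CGI variables."""
    problems = []
    docroot = env.get("DOCUMENT_ROOT", "").rstrip("/")
    script_filename = env.get("SCRIPT_FILENAME", "")
    script_name = env.get("SCRIPT_NAME", "")

    if not script_filename.startswith(docroot + "/"):
        problems.append("SCRIPT_FILENAME is not under DOCUMENT_ROOT")

    basename = script_filename.rsplit("/", 1)[-1]
    if not basename or not script_name.endswith("/" + basename):
        problems.append("SCRIPT_NAME does not end with the script file name")

    return problems


without_check_file = {
    "DOCUMENT_ROOT": "/myotherdir",
    "SCRIPT_NAME": "/kka",
    "SCRIPT_FILENAME": "/index.php",
}
with_check_file = {
    "DOCUMENT_ROOT": "/myotherdir",
    "SCRIPT_NAME": "/kka/index.php",
    "SCRIPT_FILENAME": "/myotherdir/index.php",
}

print(suspicious_cgi_vars(without_check_file))
print(suspicious_cgi_vars(with_check_file))
```

The first set trips both checks, the second set passes.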

...and then my code worked (or to be honest: it failed hard because I
failed to set up the proper database connect string).


Going back to the path info problem. When I ran the QA tests, they
backfired on the first iterations of cherokee-pathinfo.diff (the patch
that prevents part of the filename from being removed from the
request). After that the QA tests passed.

===

Questions:
1) cherokee-pathinfo.diff is a true bugfix. I intend to apply it.
Objections?

2) cherokee-extra-slash.diff is a bugfix which is nice to have. Since no
QA tests break I guess this is a good fix. I cannot see any place where
it could be abused, and I have checked it by doing crazy stuff with
malformed paths in the configuration file.

3) cherokee-cgi-documentroot.diff: should the document root be
overridden if the user did so? I tend to say yes, since we pass all QA
tests with it gloriously (aka there is no test that depends on it).

===

tl;dr
If you want to set up PHP with subdirs, just do so as you were taught by
the existing documentation, including the check_file option.



Stefan