Mailing List Archive

New version of B60-leading-slash-2.txt for 0.5.1
I've uploaded a new version of B60. httpd now issues a 404 not found
error in response to requests without a leading /.

I had to change most of the calls to translate_name(), as it wasn't appropriate
for translate_name() to call die() directly; instead it now returns BAD_URL.
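
The return-code approach described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the enum, the one-argument signature of translate_name_sketch(), and the status_for() helper are all assumptions, and the real translate_name() takes different arguments.

```c
#include <stddef.h>

/* Hypothetical result codes; only BAD_URL is named in the message. */
enum translate_result { URL_OK, BAD_URL };

/* Simplified stand-in for translate_name(): it reports a malformed URL
 * to its caller instead of calling die() itself. */
enum translate_result translate_name_sketch(const char *url)
{
    if (url == NULL || url[0] != '/')
        return BAD_URL;              /* no leading slash */
    /* ... the real function would go on to map the URL to a file ... */
    return URL_OK;
}

/* Caller side: each call site decides how to report the failure.
 * 404 is the status this version of the patch is described as sending. */
int status_for(const char *url)
{
    return translate_name_sketch(url) == BAD_URL ? 404 : 200;
}
```

The point of the design is that the decision about which error to send moves out of the translation routine and into the code that knows what kind of request it is handling.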

So I'd appreciate some testing of this patch.

David.
Re: New version of B60-leading-slash-2.txt for 0.5.1
Hmmm... two comments. First off, I still think these should get bounced
with a 400, instead of a 404. Second, it seems to me that this could be
handled fairly simply by just sticking something like

if (url[0] != '/') die (BAD_REQUEST, ...)

in process_request, before it dispatches to the request-specific code;
that might be a little less error-prone than hacking every call to
translate_name.

rst
Re: New version of B60-leading-slash-2.txt for 0.5.1
On Mon, 10 Apr 1995, David Robinson wrote:
> Besides which, I have a hidden agenda on this. Consider a URL of the form
> http://somehost.domain/../path/file
> Currently, translate_name() on this calls getparents() which simply deletes
> the leading ../ . Instead, it should really return a 400 or 404 error.
> So I want getparents() to return a code indicating that the URL was potentially
> bad, and hence I need translate_name to return a BAD_URL too.

Really? Is something like http://host/path/../path2/file.html disallowed
by the URL spec? I don't think it is - after all how do you know that
"path" isn't really a script that takes its arguments via PATH_INFO, with
".." being a valid part of its path.... This is an issue with some broken
browsers out there that misinterpret relative URL's that point up a
directory.

Roy Fielding is The Man when it comes to relative URL's - I'll wait for
his response to whether something like
http://host/path/../path2/file.html should return a 400, 404, or 200 if
http://host/path2/file.html really exists.

Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@hotwired.com brian@hyperreal.com http://www.hotwired.com/Staff/brian/
Re: New version of B60-leading-slash-2.txt for 0.5.1
>Hmmm... two comments. First off, I still think these should get bounced
>with a 400, instead of a 404.
Yes, it should. I missed your earlier comment to that effect.

>Second, it seems to me that this could be handled fairly simply by just
>sticking something like
> if (url[0] != '/') die (BAD_REQUEST, ...)
>in process_request, before it dispatches to the request-specific code;
>that might be a little less error-prone than hacking every call to
>translate_name.

But the problem occurs wherever a URL or virtual path is used. The one place
I can guarantee to catch these is in translate_name. If I fixed
process_request, then I'd also have to check the code around every other
call to translate_name. Admittedly, there would be few of them, but it does
include code I don't understand like http_mime_db.c...

Besides which, I have a hidden agenda on this. Consider a URL of the form
http://somehost.domain/../path/file
Currently, translate_name() on this calls getparents() which simply deletes
the leading ../ . Instead, it should really return a 400 or 404 error.
So I want getparents() to return a code indicating that the URL was potentially
bad, and hence I need translate_name to return a BAD_URL too.
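
A rough sketch of what a getparents() that reports this condition might look like is given below. It is not the existing httpd routine: the name getparents_checked(), the int return value, and the exact treatment of empty segments are assumptions made for illustration.

```c
#include <string.h>

/* Collapses "." and ".." segments of an absolute path in place.
 * Returns 0 on success, or -1 if the path has no leading slash or a
 * ".." segment would climb above the root. On failure the contents of
 * the buffer are unspecified. Empty segments ("//") are left alone. */
int getparents_checked(char *path)
{
    size_t len = strlen(path);
    size_t in = 0;    /* read position, always at a '/' */
    size_t out = 0;   /* write position */

    if (path[0] != '/')
        return -1;

    while (in < len) {
        size_t start = in + 1;       /* first character after the '/' */
        size_t end = start;
        size_t seglen;

        while (end < len && path[end] != '/')
            end++;
        seglen = end - start;

        if (seglen == 1 && path[start] == '.') {
            /* "." names the current level: drop it */
        } else if (seglen == 2 && path[start] == '.' && path[start + 1] == '.') {
            if (out == 0)
                return -1;           /* ".." would climb above the root */
            do {
                out--;               /* discard the previous segment */
            } while (path[out] != '/');
        } else {
            path[out++] = '/';
            memmove(path + out, path + start, seglen);
            out += seglen;
        }
        in = end;
    }

    if (out == 0)
        path[out++] = '/';           /* everything collapsed to the root */
    path[out] = '\0';
    return 0;
}
```

With this sketch, "/path/../path2/file.html" is rewritten to "/path2/file.html" and returns 0, while "/../path/file" returns -1, which translate_name() could then pass up as BAD_URL.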

David.
Re: New version of B60-leading-slash-2.txt for 0.5.1
>I still think these should get bounced with a 400, instead of a 404.
I've changed the patch to do this.

David.
Re: New version of B60-leading-slash-2.txt for 0.5.1
Brian wrote:
>On Mon, 10 Apr 1995, David Robinson wrote:
>> Besides which, I have a hidden agenda on this. Consider a URL of the form
>> http://somehost.domain/../path/file
>> Currently, translate_name() on this calls getparents() which simply deletes
>> the leading ../ . Instead, it should really return a 400 or 404 error.
>> So I want getparents() to return a code indicating that the URL was potentially
>> bad, and hence I need translate_name to return a BAD_URL too.
>
>Really? Is something like http://host/path/../path2/file.html disallowed
>by the URL spec? I don't think it is - after all how do you know that
>"path" isn't really a script that takes its arguments via PATH_INFO, with
>".." being a valid part of its path.... This is an issue with some broken
>browsers out there that misinterpret relative URL's that point up a
>directory.
>
>Roy Fielding is The Man when it comes to relative URL's - I'll wait for
>his response to whether something like
>http://host/path/../path2/file.html should return a 400, 404, or 200 if
>http://host/path2/file.html really exists.

Sorry Brian, you didn't read what I wrote. I wasn't talking about relative
URLs (although I did not make that clear.)

Firstly, in response to your (mistaken) questions:
1. http://host/path/../path2/file.html is an absolute URL, not a relative
URL, as it starts with a scheme.

2. In the context of an absolute URL, the semantics of a .. path component
are not defined. It is up to the server how it interprets this.
Thus http://host/path/../path2/file.html could even be a different document
to http://host/path2/file.html

3. For a relative URL, the semantics of .. are defined.
So path/../path2/file.html in the context of http://host/
is _defined_ to correspond to the absolute URL http://host/path2/file.html

To clarify:
I believe that the path sent in the http request is the path component of
an absolute URL, not a relative URL. So we are not considering relative
URLs here.

The current situation:
So the URL spec does not mandate any specific action in response to
GET /path/../path2/file.html. We can do whatever we want. However, I think
we would be insane not to map this onto /path2/file.html

My question (actually just a comment):
What should we do with http://host/../path/file.html, i.e. a
GET /../file.html ? Currently httpd will map this to /file.html. I was
suggesting that this is wrong, and that this should return 400 or 404.

David.

References:
draft-ietf-uri-relative-url-06.txt: `Relative Uniform Resource Locators'.
Re: New version of B60-leading-slash-2.txt for 0.5.1
David said:

> 1. http://host/path/../path2/file.html is an absolute URL, not a relative
> URL, as it starts with a scheme.

yep.

> 2. In the context of an absolute URL, the semantics of a .. path component
> are not defined. It is up to the server how it interprets this.
> Thus http://host/path/../path2/file.html could even be a different document
> to http://host/path2/file.html

yep.

> 3. For a relative URL, the semantics of .. are defined.
> So path/../path2/file.html in the context of http://host/
> is _defined_ to correspond to the absolute URL http://host/path2/file.html

yep.

> To clarify:
> I believe that the path sent in the http request is the path component of
> an absolute URL, not a relative URL. So we are not considering relative
> URLs here.

mostly yep. It is an abs_path relative URL, and thus ".." segments
do not have any special meaning to the resolver.

> The current situation:
> So the URL spec does not mandate any specific action in response to
> GET /path/../path2/file.html. We can do whatever we want. However, I think
> we would be insane not to map this onto /path2/file.html

Ummm, either way is okay; I personally am in favor of rejecting invalid
URLs in the most painful way possible (to the user), so that they won't
continue using them. "PR" is not a problem where I work. ;-)
Note, however, that if a CGI script controls that part of the resource
name mapping, you cannot safely (and should not) change the path.
I can't remember what needs to be done for PATH_TRANSLATED.

> My question (actually just a comment):
> What should we do with http://host/../path/file.html, i.e. a
> GET /../file.html ? Currently httpd will map this to /file.html. I was
> suggesting that this is wrong, and that this should return 400 or 404.

404 Not Found -- there is nothing wrong with the request itself
(other than there does not exist any resource starting with "/../").
Same goes for "/./" and "//", though these ones are more innocuous.
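
As a sketch of the "reject rather than rewrite" position, the check below flags any path that contains an empty, "." or ".." segment, so that the caller can answer 404 without touching the path. The function name has_suspect_segment() is hypothetical, and the check assumes the leading-slash test has already been done elsewhere.

```c
#include <string.h>

/* Returns 1 if the path contains "//", a "." segment or a ".." segment,
 * and 0 otherwise. A single trailing slash is not treated as suspect.
 * Paths without a leading slash are not examined and yield 0. */
int has_suspect_segment(const char *path)
{
    const char *p = path;

    while (*p == '/') {
        const char *start = p + 1;          /* segment after this slash */
        size_t n = strcspn(start, "/");     /* its length */

        if (n == 0 && start[0] == '/')
            return 1;                       /* empty segment: "//" */
        if (n == 1 && start[0] == '.')
            return 1;                       /* "." segment */
        if (n == 2 && start[0] == '.' && start[1] == '.')
            return 1;                       /* ".." segment */
        p = start + n;                      /* next slash, or end of string */
    }
    return 0;
}
```

Because the path is only inspected and never modified, a CGI script further along would still see exactly what the client sent.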

.....Roy
Re: New version of B60-leading-slash-2.txt for 0.5.1
Roy wrote:
> > The current situation:
> > So the URL spec does not mandate any specific action in response to
> > GET /path/../path2/file.html. We can do whatever we want. However, I think
> > we would be insane not to map this onto /path2/file.html
>
> Ummm, either way is okay; I personally am in favor of rejecting invalid
> URLs in the most painful way possible (to the user), so that they won't
> continue using them. "PR" is not a problem where I work. ;-)
> Note, however, that if a CGI script controls that part of the resource
> name mapping, you cannot safely (and should not) change the path.
> I can't remember what needs to be done for PATH_TRANSLATED.

I'm not sure I agree. I had already thought about this, and the conclusion
I had come to was:
if the server defines ../ to remove the previous path element, then it
would be too confusing not to do this in all circumstances. Consider
/dir/file.html/../file2.html -> /dir/file2.html
But what about
/dir/script.cgi/patharg/../file2.html
Currently, script.cgi gets a PATH_INFO of /file2.html.
It _could_ be given /patharg/../file2.html but I would argue that in the
context of the behaviour of httpd, it would be wrong for the script to
treat this as anything but /file2.html
And what about
/dir/script.cgi/../file2.html
Is this the file /dir/file2.html or the script /dir/script.cgi with
PATH_INFO=/../file2.html ??
I'm not sure what the best solution is.
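
The ambiguity comes down to whether ".." is collapsed before or after the script name is split off. The sketch below shows only the split step; path_info_for() is a hypothetical helper, and httpd's actual script matching is not shown here and may work differently.

```c
#include <string.h>

/* Returns the PATH_INFO part of `path` if `path` names `script` or
 * something below it, or NULL if the script does not match. */
const char *path_info_for(const char *path, const char *script)
{
    size_t n = strlen(script);

    if (strncmp(path, script, n) != 0)
        return NULL;
    if (path[n] != '\0' && path[n] != '/')
        return NULL;                 /* e.g. "/dir/script.cgi2" */
    return path + n;
}
```

Splitting the raw path "/dir/script.cgi/../file2.html" against "/dir/script.cgi" yields a PATH_INFO of "/../file2.html". Collapsing ".." first turns the path into "/dir/file2.html", which no longer matches the script at all, so the same request would be served as a plain file. For "/dir/script.cgi/patharg/../file2.html" the two orders agree on the script and differ only in whether PATH_INFO is "/patharg/../file2.html" or "/file2.html".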

> > My question (actually just a comment):
> > What should we do with http://host/../path/file.html, i.e. a
> > GET /../file.html ? Currently httpd will map this to /file.html. I was
> > suggesting that this is wrong, and that this should return 400 or 404.
>
> 404 Not Found -- there is nothing wrong with the request itself
> (other than there does not exist any resource starting with "/../").
> Same goes for "/./" and "//", though these ones are more innocuous.

Yes. /./ is removed.

David.