Mailing List Archive

uploaded 37_tilde_ok.0.8.15.patch
One of the earlier releases that I didn't test on my server changed
the handling of path escapes such that all FancyIndex URL paths
are escaped. Unfortunately, Apache uses the "~" character for default
user directory prefixes. The result is that Apache 0.8.15 is changing
the URL from /~user to /%7euser in a rather arbitrary fashion.
This is bad, even though it may be considered as being RFC 1738 compliant,
since we (the authors of the URL specs) are planning on including tilde
in the allowed character set anyway the next time the specs are revised.

In any case, it is guaranteed to cause users to complain and muck up
a browser's history file.

This is 37_tilde_ok.0.8.15.patch:

From: Roy Fielding <fielding@ics.uci.edu>
Subject: Changes path escaping to allow the tilde "~" character in paths.
Affects: util.c
ChangeLog: Do not escape tilde "~" character during the creation of
FancyIndexes
Comments: Yes, I know it isn't strictly according to RFC 1738, but that
document is overly restrictive (i.e., wrong) in that regard
and will be changed in the next revision.

*** util.c.dist Tue Oct 10 15:10:23 1995
--- util.c Mon Oct 30 00:08:29 1995
***************
*** 503,509 ****
for(x=0,y=0; segment[x]; x++,y++) {
char c=segment[x];
if((c < 'A' || c > 'Z') && (c < 'a' || c > 'z') && (c < '0' || c >'9')
! && ind("$-_.+!*'(),:@&=",c) == -1)
{
c2x(c,&copy[y]);
y+=2;
--- 503,509 ----
for(x=0,y=0; segment[x]; x++,y++) {
char c=segment[x];
if((c < 'A' || c > 'Z') && (c < 'a' || c > 'z') && (c < '0' || c >'9')
! && ind("$-_.+!*'(),:@&=~",c) == -1)
{
c2x(c,&copy[y]);
y+=2;
***************
*** 530,536 ****
{
char c=*path;
if((c < 'A' || c > 'Z') && (c < 'a' || c > 'z') && (c < '0' || c >'9')
! && ind("$-_.+!*'(),:@&=/",c) == -1)
{
c2x(c,s);
s+=3;
--- 530,536 ----
{
char c=*path;
if((c < 'A' || c > 'Z') && (c < 'a' || c > 'z') && (c < '0' || c >'9')
! && ind("$-_.+!*'(),:@&=/~",c) == -1)
{
c2x(c,s);
s+=3;
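
For readers who want to see the effect of the one-character change in
isolation, here is a minimal standalone sketch of the same test the patch
modifies: a character is copied unchanged if it is alphanumeric or appears
in the safe string, and is written as %xx otherwise. The names
hex_escape and escape_path_sketch are hypothetical and are not the c2x or
os_escape_path routines in util.c; only the two safe-character strings are
taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: write c as "%xx" into out (3 chars, no NUL). */
static void hex_escape(unsigned char c, char *out)
{
    static const char digits[] = "0123456789abcdef";
    out[0] = '%';
    out[1] = digits[c >> 4];
    out[2] = digits[c & 0x0f];
}

/*
 * Escape path into out. Alphanumerics and any character listed in
 * safe are copied unchanged; everything else becomes %xx.
 * out must have room for 3 * strlen(path) + 1 bytes.
 */
static void escape_path_sketch(const char *path, const char *safe, char *out)
{
    assert(path && safe && out);
    for (; *path; path++) {
        unsigned char c = (unsigned char)*path;
        int alnum = (c >= 'A' && c <= 'Z') ||
                    (c >= 'a' && c <= 'z') ||
                    (c >= '0' && c <= '9');
        if (alnum || strchr(safe, c) != NULL) {
            *out++ = (char)c;
        } else {
            hex_escape(c, out);
            out += 3;
        }
    }
    *out = '\0';
}
```

Called on "/~user/" with the pre-patch safe string "$-_.+!*'(),:@&=/" this
produces "/%7euser/"; with "~" appended to the safe string the path comes
through unchanged.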

Re: uploaded 37_tilde_ok.0.8.15.patch
>
> One of the earlier releases that I didn't test on my server changed
> the handling of path escapes such that all FancyIndex URL paths
> are escaped. Unfortunately, Apache uses the "~" character for default
> user directory prefixes. The result is that Apache 0.8.15 is changing
> the URL from /~user to /%7euser in a rather arbitrary fashion.

Huh? I can't find code that does this via mod_dir anywhere? As far as I can
see, mod_dir paths are always relative. How do I reproduce this problem?
Besides, is this the correct way to fix the problem? What if the file really
starts with a ~?

> This is bad, even though it may be considered as being RFC 1738 compliant,
> since we (the authors of the URL specs) are planning on including tilde
> in the allowed character set anyway the next time the specs are revised.

As mentioned above, what about the case when there is a real file starting
with ~? I know that this is a stupid thing to do, but the RFC doesn't require
the user to be sensible.

--
Ben Laurie Phone: +44 (181) 994 6435
Freelance Consultant Fax: +44 (181) 994 6472
and Technical Director Email: ben@algroup.co.uk
A.L. Digital Ltd,
London, England.

Re: uploaded 37_tilde_ok.0.8.15.patch
>> One of the earlier releases that I didn't test on my server changed
>> the handling of path escapes such that all FancyIndex URL paths
>> are escaped. Unfortunately, Apache uses the "~" character for default
>> user directory prefixes. The result is that Apache 0.8.15 is changing
>> the URL from /~user to /%7euser in a rather arbitrary fashion.
>
> Huh? I can't find code that does this via mod_dir anywhere? As far as I can
> see, mod_dir paths are always relative. How do I reproduce this problem?

mod_dir calls os_escape_path when it builds a fancy index. To reproduce
the problem, create a ~user/public_html directory and test subdirectory,
put some files in the test subdirectory, turn on FancyIndexing, and go to

/~user/test/

and look at the URLs generated by the fancy index. For illustration,
select the Parent Directory anchor and you will be sent to

/%7euser/

which is not what we want to happen.

> Besides, is this the correct way to fix the problem? What if the file really
> starts with a ~?

Makes no difference -- the client will not interpret the "~" character
any differently than "%7e", and that is all that matters here. The correct
behavior (for Apache) is to treat "~" as a normal character.
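
To illustrate why a file whose name really starts with "~" is unaffected,
here is a rough sketch of percent-decoding as a server might apply it to an
incoming path. It is an assumption-laden illustration, not Apache's own
unescaping code, and the names hex_value and unescape_sketch are
hypothetical. Both spellings of the request decode to the same bytes, so
the escaping choice in the index only changes what the client sees.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Decode %xx escapes from src into dst. Malformed escapes are copied
 * through unchanged. dst must have room for strlen(src) + 1 bytes.
 */
static void unescape_sketch(const char *src, char *dst)
{
    assert(src && dst);
    while (*src) {
        int hi = -1, lo = -1;
        if (src[0] == '%') {
            hi = hex_value((unsigned char)src[1]);
            if (hi >= 0)
                lo = hex_value((unsigned char)src[2]);
        }
        if (lo >= 0) {
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Decoding "/%7euser/", "/%7Euser/", and "/~user/" all yield "/~user/".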

.....Roy

Re: uploaded 37_tilde_ok.0.8.15.patch
Roy wrote:
>One of the earlier releases that I didn't test on my server changed
>the handling of path escapes such that all FancyIndex URL paths
>are escaped. Unfortunately, Apache uses the "~" character for default
>user directory prefixes. The result is that Apache 0.8.15 is changing
>the URL from /~user to /%7euser in a rather arbitrary fashion.
>This is bad, even though it may be considered as being RFC 1738 compliant,
>since we (the authors of the URL specs) are planning on including tilde
>in the allowed character set anyway the next time the specs are revised.
>
>In any case, it is guaranteed to cause users to complain and muck up
>a browser's history file.

But if the browser does not equate %7E to ~ then it is broken. Do we know
of any such broken software existing?

David.

Re: uploaded 37_tilde_ok.0.8.15.patch
>>In any case, it is guaranteed to cause users to complain and muck up
>>a browser's history file.
>
> But if the browser does not equate %7E to ~ then it is broken. Do we know
> of any such broken software existing?

Ummm, all existing versions of Mosaic and Netscape? Last time I checked,
the only software that did such canonicalization was MOMspider, the CERN
caching proxy, and Arena. Clients don't do canonicalization unless they
have to, because there exist servers which only accept invalid URLs.
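
As a rough illustration of the history-file problem: a client that compares
URLs byte for byte sees /~user/ and /%7euser/ as two different documents,
while one that decodes escapes as it compares sees one. The sketch below is
not taken from any of the clients named above; next_octet and
paths_equivalent_sketch are hypothetical names, and decoding every %xx is a
simplification (a careful canonicalizer would leave escaped reserved
characters such as %2F alone).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Return the next octet of *p, decoding a well-formed %xx escape,
 * and advance *p past it. The caller guarantees **p is not NUL.
 */
static unsigned char next_octet(const char **p)
{
    const char *s = *p;
    if (s[0] == '%') {
        int hi = hex_digit(s[1]);
        int lo = hi >= 0 ? hex_digit(s[2]) : -1;
        if (lo >= 0) {
            *p = s + 3;
            return (unsigned char)(hi * 16 + lo);
        }
    }
    *p = s + 1;
    return (unsigned char)s[0];
}

/* Return 1 if a and b are the same path once escapes are decoded. */
static int paths_equivalent_sketch(const char *a, const char *b)
{
    assert(a && b);
    while (*a && *b) {
        if (next_octet(&a) != next_octet(&b))
            return 0;
    }
    return *a == '\0' && *b == '\0';
}
```

Here strcmp("/~user/", "/%7euser/") reports a difference, whereas
paths_equivalent_sketch treats the two as the same path.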

From a philosophical perspective, the server should either always
treat "~" as an unsafe character or always treat it as an okay character.
Since we inherited the OKness from NCSA httpd (and don't even provide
an alternative to /~user), it isn't right for the server to munge it
into /%7euser. In any case, "~" is safe on all systems except 7-bit-only
terminals located in Finland, and even there it will work unless someone
writes the URL on a napkin, carries it to another country where the same
Scandinavian character is remapped to ISO-8859-1, and tries to type it
in again.

Don't you just love the rationale behind some standardization decisions?

.....Roy