Mailing List Archive

Bugs in getparents()
The patch below fixes the following bugs in getparents()
* /foo/bar.. was transformed to /foo
* /.../ was transformed to /
* /foo/.. was transformed to "" (instead of /)

which led to errors requesting URLs like http://host/foo..

getparents() did not remove ./ from the path. In addition to the CERT-reported
problems, this causes the following bug:
* /foo/./.. was transformed to /foo (instead of /)

I have taken the correct behaviour to be that defined by RFC 1808
(relative uniform resource locators) which is close to the behaviour
of the Unix filesystem.

The code had so many problems that I was forced to rewrite it. I have
tested it extensively in a standalone program on many inputs, including
the examples from RFC 1808.

By removing ./ from the path, this patch fixes the majority of the problems
reported by CERT. The remaining problems were caused by a bug in no2slash()
not combining 3 or more slashes to a single slash. A small patch for
that bug is also included.

David.

------------------- getparents.patch --------------
From: drtr@ast.cam.ac.uk (David Robinson)
Subject: Fix problems with .. and ./ in paths
Affects: util.c
ChangeLog: Rewrite getparents() to conform to RFC 1808, including the removal
of ./ from the path.

*** util.c.orig2 Mon Nov 6 18:20:35 1995
--- util.c Thu Nov 9 13:43:59 1995
***************
*** 188,221 ****
*/
void getparents(char *name)
{
! int l=0,w=0;
! const char *lookfor="..";

while(name[l]!='\0') {
! if(name[l]!=lookfor[w]) (w>0 ? (l-=(w-1),w=0) : l++);
! else {
! if(lookfor[++w]=='\0') {
! if((name[l+1]=='\0') ||
! ((name[l+1]=='/') &&
! (((l > 3) && (name[l-2] == '/'))
! || (l<=3)))) {
! register int m=l+1,n;
!
! l=l-3;
! if(l>=0) {
! while((l!=0) && (name[l]!='/')) --l;
! }
! else l=0;
! n=l;
! while((name[n]=name[m])) (++n,++m);
! w=0;
! }
! else w=0;
}
! else ++l;
! }
}
! }

void no2slash(char *name) {
register int x,y;
--- 188,245 ----
*/
void getparents(char *name)
{
! int l, w;
!
! /* Four paseses, as per RFC 1808 */
! /* a) remove ./ path segments */
!
! for (l=0, w=0; name[l] != '\0';)
! {
! if (name[l] == '.' && name[l+1] == '/' && (l == 0 || name[l-1] == '/'))
! l += 2;
! else
! name[w++] = name[l++];
! }
!
! /* b) remove trailing . path segment */
! if (w == 1 && name[0] == '.') w--;
! else if (w > 1 && name[w-1] == '.' && name[w-2] == '/') w--;
! name[w] = '\0';
!
! /* c) remove all xx/../ segments. (including leading ../ and /../) */
! l = 0;

while(name[l]!='\0') {
! if(name[l] == '.' && name[l+1] == '.' && name[l+2] == '/' &&
! (l == 0 || name[l-1] == '/')) {
! register int m=l+3,n;
!
! l=l-2;
! if(l>=0) {
! while(l >= 0 && name[l] != '/') l--;
! l++;
! }
! else l=0;
! n=l;
! while((name[n]=name[m])) (++n,++m);
}
! else ++l;
! }
!
! /* d) remove trailing xx/.. segment. */
! if (l == 2 && name[0] == '.' && name[1] == '.') name[0] = '\0';
! else if (l > 2 && name[l-1] == '.' && name[l-2] == '.' && name[l-3] == '/')
! {
! l = l - 4;
! if (l >= 0)
! {
! while (l >= 0 && name[l] != '/') l--;
! l++;
! }
! else l = 0;
! name[l] = '\0';
}
! }

void no2slash(char *name) {
register int x,y;
------------------- end getparents.patch --------------

------------------- no2slash.patch --------------
From: drtr@ast.cam.ac.uk (David Robinson)
Subject: Fix problems with ///
Affects: util.c
ChangeLog: no2slash() should convert /// -> /

*** util.c.orig3 Thu Nov 9 13:43:59 1995
--- util.c Thu Nov 9 14:33:12 1995
***************
*** 244,253 ****
void no2slash(char *name) {
register int x,y;

! for(x=0; name[x]; x++)
if(x && (name[x-1] == '/') && (name[x] == '/'))
for(y=x+1;name[y-1];y++)
name[y-1] = name[y];
}

char *make_dirstr(pool *p, char *s, int n) {
--- 244,254 ----
void no2slash(char *name) {
register int x,y;

! for(x=0; name[x];)
if(x && (name[x-1] == '/') && (name[x] == '/'))
for(y=x+1;name[y-1];y++)
name[y-1] = name[y];
+ else x++;
}

char *make_dirstr(pool *p, char *s, int n) {
------------------- end no2slash.patch --------------
Re: Bugs in getparents() [ In reply to ]
Hmmm... at first blush, this looks OK (though I haven't tried running
with it yet). Two comments:

1) The combination of this patch and my initial "fix" to the
ScriptAlias non-bug would "fix" every print-the-script gotcha
*that* *I'm* *currently* *aware* *of*, without breaking currently
valid URIs or sending new redirects. I could live with that, but
I'm still not sure we're aware of all such anomalies, and would not
be entirely comfortable telling the users (even implicitly, by not
warning them about these sorts of problems) that we did. Still, I
could live with it, if it would help get something out the door.

2) It might be better to resolve /foo//../bar to /bar rather than
/foo/bar. That might not be *strictly* in accord with RFC 1808,
but it's not much of a deviation, it would probably be more in
accord with user expectations, and it would eliminate the
<!--#include virtual="../foo"--> anomaly, not that I consider that
any big deal either. This would involve adding two extra lines of
code to the rewritten getparents() routine, one each in parts three
and four (loops skipping over extra slahes preceding '/..', viz. my
earlier getparents() patch).

rst