Mailing List Archive

new B51 patch available
Patch B51 premature pattern match bug fix.

Patch to fix premature pattern matching with aliases.
e.g. if you have an alias for /abs then you
can't reach a URL called /abstract.html, 'cos a
substitution takes place on the start of the string.
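
A minimal sketch of the premature match, for illustration only. The helper
naive_alias() and the target path /cgi/lookup are hypothetical and are not
taken from http_alias.c; the sketch just assumes the alias is applied
whenever the URL starts with the alias string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the code in http_alias.c: substitutes `real`
 * for `fake` whenever the URL merely starts with `fake`.
 * Returns 1 and fills `out` on a match, 0 otherwise. */
static int naive_alias(const char *url, const char *fake, const char *real,
                       char *out, size_t outsz)
{
    size_t n;

    assert(url != NULL && fake != NULL && real != NULL && out != NULL);
    n = strlen(fake);
    if (strncmp(url, fake, n) != 0)
        return 0;               /* URL does not start with the alias */
    /* No check on url[n]: this is the premature match. */
    snprintf(out, outsz, "%s%s", real, url + n);
    return 1;
}
```

With an alias from /abs to /cgi/lookup, a request for /abstract.html is
rewritten to /cgi/lookuptract.html, so the real document can't be reached.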

file affected : http_alias.c
the string compares for alias matching
now check that the pattern matches a whole
word i.e.
/abs matches /abs
/abs?
/abs#
/abs;
/abs,
/abs/

/abs fails to match anything else.


I didn't think it through too much, but it looks reasonable.

I tried it with aliases for

/abs -> cgi script
/abstruct -> a directory

and with a document called /abstruct.html

Seems to work just fine. I'm assuming that the characters
[nothing] ? # : , / are the only "word" boundary
chars in URLs. If not, then it's easy to add more.


robh
Re: new B51 patch available
>
> Robh wrote:
> > Patch B51 premature pattern match bug fix.
> >...
> Looks ok to me.
> What about unmunge_name()? Probably not worth fixing.

I haven't looked at unmunge_name(), what's the problem there ?

> >...I'm assuming that the characters
> >[nothing] ? # : , / are the only "word" boundary
> >chars in URLs. If not, then it's easy to add more.

> No. You should only allow '\0' and '/'.

> Regarding '?'; I now realise that several parts of httpd contain the bug
> of not removing the search part from a URL before passing it to translate_name.

In which case, is it best to leave "?" in the search pattern ?

How about # : , are they harmless ?


rob
Re: new B51 patch available
> Suppose I have two aliases:
> Alias /icona/ /local/a
> Alias /trial/ /local/abstract
>
> Then if I try to read /trial/pic.html, and it does not exist, I will
> get an error:
> `The requested URL /icona/bstract/pic.html was not found on this server.'

cute.

So it's doing a reverse translation on the URL.

Maybe we need to fix unmunge_name to return original_url[1] if
original_url[0] == '\0'

[original_url[0] == '\0' before hitting die() or a redirect.]

It'll still mean that redirected URLs are prone to incorrect reverse
translation, but it'll fix and speed up most unmunges.

-=-=-=
> The point is that a URL _can_ contain these characters, as long as they
> are escaped (for # and ?). But by the time translate_name() has been called
> (at least for the case of the original URL), the URL path has been unescaped,
> and so can legitimately contain 'non-active' # and ?.

Okay, I changed the B51 patch to just look for '/' and '\0'
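
A minimal sketch of that boundary check, not the patch itself. The name
alias_matches() is hypothetical, and accepting an alias that already ends
in '/' (like /icona/) is an assumption that the thread does not settle.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `fake` matches `url` as a whole
 * path prefix, i.e. the character after the prefix is '/' or '\0'. */
static int alias_matches(const char *url, const char *fake)
{
    size_t n;

    assert(url != NULL && fake != NULL);
    n = strlen(fake);
    if (strncmp(url, fake, n) != 0)
        return 0;               /* not even a string prefix */
    /* Assumption: an alias ending in '/' already ends on a boundary. */
    if (n > 0 && fake[n - 1] == '/')
        return 1;
    /* Safe to read url[n]: the first n characters matched `fake`. */
    return url[n] == '\0' || url[n] == '/';
}
```

With this check /abs matches /abs and /abs/foo, but not /abstract.html
and not /abs?x.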


robh
Re: new B51 patch available
Robh wrote:
> Patch B51 premature pattern match bug fix.
>...
Looks ok to me.
What about unmunge_name()? Probably not worth fixing.

>...I'm assuming that the characters
>[nothing] ? # : , / are the only "word" boundary
>chars in URLs. If not, then it's easy to add more.

No. You should only allow '\0' and '/'.

Regarding '?'; I now realise that several parts of httpd contain the bug
of not removing the search part from a URL before passing it to translate_name.

Sigh.

David.
Re: new B51 patch available
RobH wrote:
> >
> > Robh wrote:
> > > Patch B51 premature pattern match bug fix.
> > >...
> > Looks ok to me.
> > What about unmunge_name()? Probably not worth fixing.
>
> I haven't looked at unmunge_name(), what's the problem there ?

Suppose I have two aliases:
Alias /icona/ /local/a
Alias /trial/ /local/abstract

Then if I try to read /trial/pic.html, and it does not exist, I will
get an error:
`The requested URL /icona/bstract/pic.html was not found on this server.'

Mind you, this is not much worse than the existing bugs in that routine.
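
A minimal sketch of how a reverse translation could produce that message.
This is not the real unmunge_name(); struct alias and naive_unmunge() are
hypothetical, and the sketch assumes the translated file name is
/local/abstract/pic.html and that the first alias whose real path is a
string prefix of it wins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical alias table entry: URL prefix and the path it maps to. */
struct alias {
    const char *fake;
    const char *real;
};

/* Hypothetical reverse translation: the first alias whose real path is
 * a plain string prefix of `file` is used to rebuild the URL. */
static void naive_unmunge(const struct alias *tab, size_t ntab,
                          const char *file, char *out, size_t outsz)
{
    size_t i;

    assert(tab != NULL && file != NULL && out != NULL);
    for (i = 0; i < ntab; i++) {
        size_t n = strlen(tab[i].real);

        if (strncmp(file, tab[i].real, n) == 0) {
            /* No boundary check, so /local/a also matches /local/abstract. */
            snprintf(out, outsz, "%s%s", tab[i].fake, file + n);
            return;
        }
    }
    snprintf(out, outsz, "%s", file);   /* no alias applies */
}
```

With the two aliases above in that order, /local/abstract/pic.html comes
back as /icona/bstract/pic.html, because /local/a is found first.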

> > >...I'm assuming that the characters
> > >[nothing] ? # : , / are the only "word" boundary
> > >chars in URLs. If not, then it's easy to add more.
>
> > No. You should only allow '\0' and '/'.
>
> > Regarding '?'; I now realise that several parts of httpd contain the bug
> > of not removing the search part from a URL before passing it to translate_name.
>
> In which case, is it best to leave "?" in the search pattern ?
Thinking on, no, it is not safe.
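
A minimal sketch of the alternative: cut the search part off before the
name is translated, rather than treating '?' as a boundary. split_search()
is a hypothetical helper, not an existing httpd routine, and it assumes it
is handed the URL while it is still escaped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: cuts `url` in place at the first '?'.
 * Returns a pointer to the search part, or NULL if there is none.
 * Assumes `url` is still escaped, so any '?' really is a separator. */
static char *split_search(char *url)
{
    char *q;

    assert(url != NULL);
    q = strchr(url, '?');
    if (q == NULL)
        return NULL;
    *q = '\0';                  /* the path now ends before the '?' */
    return q + 1;
}
```

After the split, /abs?x=1 is just /abs and matches the alias on the '\0'
boundary.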

> How about # : , are they harmless ?
Harmless in the sense that they do not carry any semantics, and are just like
any other character in a path.

The point is that a URL _can_ contain these characters, as long as they
are escaped (for # and ?). But by the time translate_name() has been called
(at least for the case of the original URL), the URL path has been unescaped,
and so can legitimately contain 'non-active' # and ?.
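
A minimal sketch of why that is so. unescape_path() and hexval() are
hypothetical stand-ins for whatever httpd actually uses to unescape; the
sketch only decodes %XX sequences in place.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hexval(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Hypothetical helper: decodes %XX escapes in `s` in place.
 * Malformed escapes are copied through unchanged. */
static void unescape_path(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s != '\0') {
        if (s[0] == '%' && isxdigit((unsigned char)s[1])
                        && isxdigit((unsigned char)s[2])) {
            *out++ = (char)(hexval(s[1]) * 16 + hexval(s[2]));
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}
```

Unescaping /abs%23notes/a%3Fb.html gives /abs#notes/a?b.html, so a '#' or
'?' seen after this point is ordinary path data and not a delimiter.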

David.