Mailing List Archive

redirection revisited
Hi y'all,


I've been playing with redirection today. I extended the
custom error responses code to work with the common-or-garden
variety of redirect, so that the new "REDIRECT_" prefixed
CGI variable 'extensions' are available after a redirect.

Also, while I was doing this I fixed,

1) a die() redirect after an AUTH REQ will always return a 401 status,
so you can have a beautiful html file to explain how to register,
'cos it'll come back with a 401 instead of a 200 status.

2) all redirects, whether explicit or after a die() will log both
the original URL and the new one,
e.g.

ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/loc?weeeeeee HTTP/1.0" 302 0
ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/tester" 200 1506


{/cgi-bin/loc redirects to /cgi-bin/tester}

and

ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/crach?bleah HTTP/1.0" 500 0
ooo.lanl.gov - - [03/A .. ] "GET /try.html" 200 1148


{/cgi-bin/crach?bleah is a buggy script, with ErrorDocument 500 pointing to
/try.html}


You don't deserve it, but I also introduced a new variable "REDIRECT_STATUS"
which will let scripts know why the redirect happened. E.g. if it is
set to 500, then we know that REDIRECT_URL exploded on us. So you can
now write a general purpose problem reporter/analyzer.
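
A minimal sketch of such a reporter, in Python, assuming only that the
server sets REDIRECT_STATUS and REDIRECT_URL as described above. The
function name and the wording of the messages are illustrative and are
not part of the patch:

```python
import os


def describe_redirect(env=os.environ):
    """Return a one-line report on why the server redirected to this script."""
    status = env.get("REDIRECT_STATUS")
    url = env.get("REDIRECT_URL") or "unknown URL"
    if status is None:
        # REDIRECT_* variables are optional, so this case must be handled.
        return "No redirect information available."
    if status == "500":
        return f"{url} failed with a server error (500)."
    if status == "401":
        return f"{url} requires authorization (401)."
    return f"{url} was redirected with status {status}."


if __name__ == "__main__":
    # Plain CGI response: header, blank line, body.
    print("Content-Type: text/plain")
    print()
    print(describe_redirect())
```

Passing the environment in as an argument keeps the function easy to
exercise outside the server.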

A cleaned up patch will be uploaded to hyperreal tomorrow, assuming
nobody has a problem with the above.


robh
Re: redirection revisited
To the NCSA folk and Rob McCool if he's interested in the idea..

Shall we try to make "REDIRECT_" a de facto standard for a CGI 1.2,
where any CGI variable prefixed by this string can be assumed to
have been built from an earlier CGI var that existed prior to
a redirect? And let's throw in DOCUMENT_ROOT while we're at it.

I've also added

REDIRECT_STATUS
and
REDIRECT_URL

Maybe I need to change "REDIRECT_URL" to "REDIRECT_DOCUMENT_URI",
but CGI scripts don't have a DOCUMENT_URI; they use SCRIPT_NAME to
do the same job.

Hmmm, CGI is a bit of a mess anyway.

The new "standard" would make "REDIRECT_*" optional. Scripts cannot
assume they'll see any.
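
As a rough illustration of what "optional" means for a script, here is a
small Python sketch that gathers whatever "REDIRECT_" variables happen to
be present and copes with there being none. The helper name is made up
for this example:

```python
import os

PREFIX = "REDIRECT_"


def redirect_vars(env=os.environ):
    """Collect REDIRECT_-prefixed variables, keyed by the original CGI name.

    Returns an empty dict when the server supplied none, so callers
    never have to assume any particular variable exists.
    """
    return {
        name[len(PREFIX):]: value
        for name, value in env.items()
        if name.startswith(PREFIX)
    }
```

A script would then use something like `redirect_vars().get("STATUS")`
and treat a missing value as "no redirect information".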


robh,
redirecting to /home/kitchen/fridge
Re: redirection revisited
> Mmm, is there any way that we can have a different .html file depending on
> the accessing URL?

Arrgghhh :-)

Why not redirect to a script and have the script decide what to
output? All the info you need to decide how to respond will be
available via "REDIRECT_" prefixed cgi-vars.
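
Something along these lines, sketched in Python. It assumes the server
sets REDIRECT_URL to the URL that was originally requested; the URL
prefixes and page paths below are hypothetical placeholders:

```python
import os

# Hypothetical mapping from URL prefix to the page explaining the error.
PAGES = {
    "/mail-order/": "/mail-order/howto_register.html",
    "/cgi-bin/": "/errors/script_problem.html",
}
# Hypothetical fallback used when nothing more specific matches.
DEFAULT_PAGE = "/errors/generic.html"


def pick_page(env=os.environ):
    """Choose an explanation page based on the originally requested URL."""
    original = env.get("REDIRECT_URL", "")
    for prefix, page in PAGES.items():
        if original.startswith(prefix):
            return page
    return DEFAULT_PAGE
```

The error handler is configured once, server-wide, and the
per-application choice lives in the script's table.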

> Cool, BUT I'M NOT SATISFIED YET! ;) Is there any indication that the initial
> access resulted in the redirect, other than entries being next to each
> other and having the same date? Time stamp resolution isn't enough
> to determine precisely which initial access is responsible for
> a cascade of redirects [if we even allow cascades].

I thought about marking the logfile entries to attempt to tie the
two URLs together, but without breaking the current standard log
format (CLF) it looks tricky...

Your server transaction number (process id perhaps) looks interesting, but
it might break CLF analyzers.

Another partial solution might be to add a server-side method for logging,
say "REDIRECT", e.g.

ooo.lanl.gov - - [04/A .. 0600] "GET /redirector HTTP/1.0" 302 -
ooo.lanl.gov - - [04/A .. 0600] "REDIRECT /redirectee HTTP/1.0" 202 12345

This would make redirections stand out in the logs.

But we still need to tie the two entries together. Any ideas on where
we could record a transaction number without upsetting the CLF ?

Would it be safe to log them as,

ooo.lanl.gov - - [04/A .. 0600] "GET /redirector;NNNNN HTTP/1.0" 302 -
ooo.lanl.gov - - [04/A .. 0600] "GET /redirectee;NNNNN HTTP/1.0" 202 12345
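
To get a feel for what an analyzer would have to do with such entries,
here is a Python sketch that splits the suffix back off the request
path. It assumes the suffix is always ";" followed by digits, which is
only the proposal above and not anything a server does today. Note that
a URL which legitimately ends in ";" plus digits would be misread, which
is one reason the scheme may not be safe:

```python
import re

# Matches the quoted request field, e.g. "GET /redirector;12345 HTTP/1.0".
REQUEST = re.compile(
    r'"(?P<method>\S+) (?P<path>[^\s";]+)'
    r'(?:;(?P<txn>\d+))?'          # optional ";NNNNN" transaction suffix
    r'(?: (?P<proto>[^"]+))?"'     # optional protocol version
)


def split_transaction(line):
    """Return (path, transaction) from a log line, or None if unparseable.

    transaction is None when the entry carries no ";NNNNN" suffix.
    """
    match = REQUEST.search(line)
    if match is None:
        return None
    return match.group("path"), match.group("txn")
```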

Any log file gurus out there ?

robh
Re: redirection revisited
>
> > But we still need to tie the two entries together. Any ideas on where
> > we could record a transaction number without upsetting the CLF ?
> >
> > Would it be safe to log them as,
> >
> > ooo.lanl.gov - - [04/A .. 0600] "GET /redirector;NNNNN HTTP/1.0" 302 -
> > ooo.lanl.gov - - [04/A .. 0600] "GET /redirectee;NNNNN HTTP/1.0" 202 12345
> >
> > Any log file gurus out there ?
>
> What about using up the first '-'? Huh, anyway the trick is to redefine CLF,
> but I don't even know where that protocol is maintained. Randy, any chance we
> could have some documentation on the Common Log file Format available from
> hyperreal? Perhaps there's an FAQ or an RFC or a WTF we can web up.

I'll see what I can dig up. I assume that everyone knows to look at

http://www.hyperreal.com/apache/docs for current Apache documentation.

My HPsUX cluster server is having trouble getting it up today.
I'll see what I can find over lunch (if I get one)....
Re: redirection revisited
> Also, while I was doing this I fixed,

[the man doesn't sleep?]

> 1) a die() redirect after an AUTH REQ will always return a 401 status,
> so you can have a beautiful html file to explain how to register,
> 'cos it'll come back with a 401 instead of a 200 status.

Mmm, is there any way that we can have a different .html file depending on
the accessing URL? The way I see it, different 'applications' on the server
would have different text they wished to display to the user. Suppose Joe-Average
was issuing a passwd to a mail-order form; well, they might prefer to see
an explanation from "Bumpy's Friendly Mail-order" rather than from
Webmaster@Oh.Rilly.Corp. Perhaps there's scope for yet another obfuscation of the
.htaccess file which allows *users* to define the page to be used in the event of
an error:

--- cut here ---
[... usual .htaccess stuff ...]
ErrorPage 500 bumpys_error_report_form.html
ErrorPage 401 bumpys_howto_register_FAQ.html
--- cut here ---

So any error 401 reached in the same directory as the .htaccess file can result
in Bumpy's FAQ rather than the admin's FAQ-of-last-resort.

> 2) all redirects, whether explicit or after a die() will log both
> the original URL and the new one,
> e.g.
>
> ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/loc?weeeeeee HTTP/1.0" 302 0
> ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/tester" 200 1506
>
>
> {/cgi-bin/loc redirects to /cgi-bin/tester}
>
> and
>
> ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/crach?bleah HTTP/1.0" 500 0
> ooo.lanl.gov - - [03/A .. ] "GET /try.html" 200 1148
>
>
> {/cgi-bin/crach?bleah is a buggy script, with ErrorDocument 500 pointing to
> /try.html}

Cool, BUT I'M NOT SATISFIED YET! ;) Is there any indication that the initial
access resulted in the redirect, other than entries being next to each other and
having the same date? Time stamp resolution isn't enough to determine precisely
which initial access is responsible for a cascade of redirects [if we even allow
cascades].

I'm a real-life OO-fascist, the sort of guy that likes to see unique identifiers
associated with all transactions. Can we have some kind of transaction identifier,
a unique value chosen at the initial access time, which can be used to tag all
log entries resulting from an access? This could be a CGI variable too,
SERVER_TRANSACTION (or whatever), so that scripts can get a grip on things, perhaps.

So, if the server's logging transactions, and we've enabled the log-file to record
things we might see:

ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/loc?weeeeeee HTTP/1.0" 302 0 46393272
ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/tester" 200 1506 46393272
ooo.lanl.gov - - [03/A .. ] "GET /" 200 2425 28372104
ooo.lanl.gov - - [03/A .. ] "GET /cgi-bin/crach?bleah HTTP/1.0" 500 0 38626102
ooo.lanl.gov - - [03/A .. ] "GET /try.html" 200 1148 38626102

Now we're *really* sure which access caused which redirect.
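
For what it's worth, tying entries together then becomes trivial for a
log analyzer. A Python sketch, assuming the transaction number is always
the last space-separated field as in the example above (that layout is
only this proposal, not an existing log format):

```python
from collections import defaultdict


def group_by_transaction(lines):
    """Group log entries by their trailing transaction identifier.

    Returns a dict mapping transaction id to the list of entries
    (with the id stripped), in the order they appeared.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space: everything before it is the entry.
        entry, _, transaction = line.rpartition(" ")
        groups[transaction].append(entry)
    return dict(groups)
```

Any transaction with more than one entry is an access that caused a
redirect, and the entries are in the order the server handled them.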

Well, that's one way of doing it.

> You don't deserve it, but I also introduced a new variable "REDIRECT_STATUS"
> which will let scripts know why the redirect happened. E.g. if it is
> set to 500, then we know that REDIRECT_URL exploded on us. So you can
> now write a general purpose problem reporter/analyzer.

Joy.

> A cleaned up patch will be uploaded to hyperreal tomorrow, assuming
> nobody has a problem with the above.
>
>
> robh

I went to the same school as this guy you know. ;)

Ay.
Re: redirection revisited
> But we still need to tie the two entries together. Any ideas on where
> we could record a transaction number without upsetting the CLF ?
>
> Would it be safe to log them as,
>
> ooo.lanl.gov - - [04/A .. 0600] "GET /redirector;NNNNN HTTP/1.0" 302 -
> ooo.lanl.gov - - [04/A .. 0600] "GET /redirectee;NNNNN HTTP/1.0" 202 12345
>
> Any log file gurus out there ?

What about using up the first '-'? Huh, anyway the trick is to redefine CLF,
but I don't even know where that protocol is maintained. Randy, any chance we
could have some documentation on the Common Log file Format available from
hyperreal? Perhaps there's an FAQ or an RFC or a WTF we can web up.

> robh
>

Ay.