Mailing List Archive

hmmm...
This might be bad form to complain about this functionality this late in
the game, but conceptually I have a hard time justifying the
two-web-log-hits effect of error response redirects. I.e., when I access
a protected area under a bogus username/password:

fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 -
fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /401.html" 200 703

The problem is that the second one, when not in the context of the first,
looks like a valid user "asdfsaf" accessed a page under authentication.
I'd have to tell my scripts "no, no, toss out all accesses to 401.html
before doing any user-based analysis".
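
For illustration, the "toss out all accesses to 401.html" step could be sketched as follows. This is only a minimal sketch in Python, not anything Apache ships: the regular expression, the names parse and drop_error_documents, and the ERROR_DOCS set are all assumptions made for the example.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

# Assumption: the site's error documents are known in advance.
ERROR_DOCS = {"/401.html"}


def parse(line):
    """Split one CLF line into a dict, or return None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    host, _ident, user, when, request, status, _size = m.groups()
    parts = request.split()
    # The request is "METHOD path [protocol]"; the protocol may be absent.
    path = parts[1] if len(parts) > 1 else ""
    return {"host": host, "user": user, "time": when,
            "path": path, "status": int(status)}


def drop_error_documents(lines):
    """Keep every line except those whose path is a known error document."""
    kept = []
    for line in lines:
        entry = parse(line)
        if entry is not None and entry["path"] in ERROR_DOCS:
            continue
        kept.append(line)
    return kept
```

Run over the two lines above, this keeps only the 401 entry, so "asdfsaf" no longer shows up as having fetched a page successfully.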

What do people think?

Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com brian@hyperreal.com http://www.[hyperreal,organic].com/
Re: hmmm...
> This might be bad form to complain about this functionality this late in
> the game, but conceptually I have a hard time justifying the
> two-web-log-hits effect of error response redirects. I.e., when I access
> a protected area under a bogus username/password:
>
> fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 -
> fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /401.html" 200 703
>
> The problem is that the second one, when not in the context of the first,
> looks like a valid user "asdfsaf" accessed a page under authentication.
> I'd have to tell my scripts "no, no, toss out all accesses to 401.html
> before doing any user-based analysis".
>
> What do people think?

It's more accurate. As others would say - "do your fancy stuff in perl".


rob
Re: hmmm...
On Wed, 19 Apr 1995, Rob Hartill wrote:
> > This might be bad form to complain about this functionality this late in
> > the game, but conceptually I have a hard time justifying the
> > two-web-log-hits effect of error response redirects. I.e., when I access
> > a protected area under a bogus username/password:
> >
> > fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 -
> > fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /401.html" 200 703
> >
> > The problem is that the second one, when not in the context of the first,
> > looks like a valid user "asdfsaf" accessed a page under authentication.
> > I'd have to tell my scripts "no, no, toss out all accesses to 401.html
> > before doing any user-based analysis".
> >
> > What do people think?
>
> It's more accurate. As others would say - "do your fancy stuff in perl".

Er, no, it's not more accurate. The server did *not* return a "200"
response when the 401.html file was returned. Of course I can write a
perl script to grep -v 401.html's - that's not my point. My point is
that until this feature every log file transaction *always* represented a
unique job. There was no such request "GET /401.html".

There was talk on www-talk about what to do when the server returns an
object whose canonical name was different than the request - and the
answer was that right now there's no way in the CLF to express that, it
should wait until CLFII. I.e., a "correct way" might be

fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 - "/401.html"

just like

foo - - [19/Apr/1995:01:03:05 -0700] "GET /cgi-bin/imagemap/foo?23,45 HTTP/1.0" 200 2345 "/menuitem1.html"

in response to an imagemap query once HTTP 1.1 allows us to use BASE
instead of redirects.
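
To make the idea concrete, here is a rough sketch of how a log reader might accept such a line. It is written in Python purely for illustration. The trailing quoted field is only the hypothetical extension suggested above, not an agreed format, and the name parse_extended and the "served" key are assumptions of the example.

```python
import re

# CLF plus an optional trailing quoted field naming the object actually served.
EXTENDED = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)'
    r'(?: "([^"]*)")?\s*$'
)


def parse_extended(line):
    """Parse a CLF line, with or without the hypothetical extra field."""
    m = EXTENDED.match(line)
    if m is None:
        return None
    host, _ident, user, when, request, status, size, served = m.groups()
    return {
        "host": host,
        "user": None if user == "-" else user,
        "time": when,
        "request": request,
        "status": int(status),
        "bytes": None if size == "-" else int(size),
        "served": served,  # None when the line is plain CLF
    }
```

A plain CLF line still parses, with "served" left as None, so old and new entries could in principle live in the same file.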

Brian

Re: hmmm...
> > It's more accurate. As others would say - "do your fancy stuff in perl".
>
> Er, no, it's not more accurate. The server did *not* return a "200"

It is more accurate in that it's telling you that there was a redirect,
from one page to another. The 200/401 looks like a little lie though.
That's because at the last minute Apache changes to a 401 status
response, but doesn't log it as such.
It's probably fixable by changing "status" to AUTH_REQUIRED at the
same time (in begin_http_header() ).

robh
Re: hmmm...
> Erk. Weeel, it looks like this is really not 1.3R compliant behaviour, so
> does it pass the backward-compatibility test? If no, then we should
> unpatch and make 0.6.2, or document it and put a support/apache2common
> script somewhere.
>
> The script is the simpler option (for us), but probably not the best
> option. Whatever form CLFII takes, and whether or not Apache adopts it
> in favour of CLF or ApacheLF (with optional frosting), it looks like
> we've jumped the gun.
>
> So who's got scripts that'd die tomorrow if this behaviour was dropped?

Doesn't CLF define only the format of output, not the meaning?
If so, Apache and NCSA 1.4 aren't doing anything wrong. It's up to
the server to decide how things get logged. No ?

robh
Re: hmmm...
On Wed, 19 Apr 1995, Andrew Wilson wrote:
> Erk. Weeel, it looks like this is really not 1.3R compliant behaviour, so
> does it pass the backward-compatibility test? If no, then we should
> unpatch and make 0.6.2, or document it and put a support/apache2common
> script somewhere.

No, I think for now we should just not log the second response, the "real
object" that was returned. For 401 access for example, I know that
401.html will always be returned, so logging it is redundant.

Brian

Re: hmmm...
> On Wed, 19 Apr 1995, Andrew Wilson wrote:
> > Erk. Weeel, it looks like this is really not 1.3R compliant behaviour, so
> > does it pass the backward-compatibility test? If no, then we should
> > unpatch and make 0.6.2, or document it and put a support/apache2common
> > script somewhere.
>

Brian responded

> No, I think for now we should just not log the second response, the "real
> object" that was returned. For 401 access for example, I know that
> 401.html will always be returned, so logging it is redundant.

What about the other error/problem redirects, and regular redirects?
Should we not log those too?

Is it also redundant to log
/missing
when I know that /missing/ is almost sure to follow a second
later?


You seem to be saying there's too much information there. I'd
disagree. If some of the info isn't of interest to you, ignore it.


robh
Re: hmmm...
[logging redirected URLs...]

> Er, no, it's not more accurate. The server did *not* return a "200"
> response when the 401.html file was returned. Of course I can write a
> perl script to grep -v 401.html's - that's not my point. My point is
> that until this feature every log file transaction *always* represented a
> unique job. There was no such request "GET /401.html".
>
> There was talk on www-talk about what to do when the server returns an
> object whose canonical name was different than the request - and the
> answer was that right now there's no way in the CLF to express that, it
> should wait until CLFII. I.e., a "correct way" might be
>
> fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 - "/401.html"
>
> just like
>
> foo - - [19/Apr/1995:01:03:05 -0700] "GET /cgi-bin/imagemap/foo?23,45 HTTP/1.0" 200 2345 "/menuitem1.html"
>
> in response to an imagemap query once HTTP 1.1 allows us to use BASE
> instead of redirects.

Erk. Weeel, it looks like this is really not 1.3R compliant behaviour, so
does it pass the backward-compatibility test? If no, then we should
unpatch and make 0.6.2, or document it and put a support/apache2common
script somewhere.

The script is the simpler option (for us), but probably not the best
option. Whatever form CLFII takes, and whether or not Apache adopts it
in favour of CLF or ApacheLF (with optional frosting), it looks like
we've jumped the gun.

So who's got scripts that'd die tomorrow if this behaviour was dropped?

> Brian


Ay.

Andrew Wilson URL: http://www.cm.cf.ac.uk/User/Andrew.Wilson/
Elsevier Science, Oxford Office: +44 01865 843155 Mobile: +44 0589 616144
Re: hmmm...
> Doesn't CLF define only the format of output, not the meaning?
> If so, Apache and NCSA 1.4 aren't doing anything wrong. It's up to
> the server to decide how things get logged. No ?

I've not seen the CLF documentation (if there is any), where's it kept?

> robh
>

Ay.
Re: hmmm...
> Brian responded
>
> > No, I think for now we should just not log the second response, the "real
> > object" that was returned. For 401 access for example, I know that
> > 401.html will always be returned, so logging it is redundant.
>
> What about the other error/problem redirects, and regular redirects?
> Should we not log those too?
>
> Is it also redundant to log
> /missing
> when I know that /missing/ is almost sure to follow a second
> later?
>
>
> You seem to be saying there's too much information there. I'd
> disagree. If some of the info isn't of interest to you, ignore it.

This is where I'm sitting too. There just isn't enough information in
the logs to be of any damned use to anyone. The current state of affairs
(which Brian objects to) isn't really what I'm looking for, and seems,
in all fairness, to be a kludge. It's feasible to do per-user
tracking (business & marketing types 'lerrrve' this stuff) by analysing a
more complete logfile, but we're not there yet.

Is it too much work to add a support/apache2common script? Or would other
behaviour break if we removed the patch that added the additional log-entry?

Buh.

> robh
>

Ay.

Re: hmmm...
> This might be bad form to complain about this functionality this late in
> the game, but conceptually I have a hard time justifying the
> two-web-log-hits effect of error response redirects. I.e., when I access
> a protected area under a bogus username/password:
>
> fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 -
> fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /401.html" 200 703
>
> The problem is that the second one, when not in the context of the first,
> looks like a valid user "asdfsaf" accessed a page under authentication.
> I'd have to tell my scripts "no, no, toss out all accesses to 401.html
> before doing any user-based analysis".
>
> What do people think?

I think this is a bug. The common logfile format was intended to
log user requests, not server actions. Saying that the server received
a request "GET /401.html" is lying.

......Roy
584 messages down, 1109 to go (geez, if I could just get people
to stop work for a while...)
Re: hmmm...
On Sat, 6 May 1995, Roy T. Fielding wrote:
> > fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /Login/ HTTP/1.0" 401 -
> > fully - asdfsaf [19/Apr/1995:01:03:05 -0700] "GET /401.html" 200 703
>
> I think this is a bug. The common logfile format was intended to
> log user requests, not server actions. Saying that the server received
> a request "GET /401.html" is lying.

Right. I think we left the consensus at something like this:

The CLF is too restrictive, as the object requested may be significantly
different than the one received, so we need a way to specify both in the
logfile. This is true for many actions, multiviews for example - just
looking at logfiles won't tell me how many mother.html's versus
mother.html3's were serviced. However, it was deemed well and good to
still log server internal redirects while waiting for CLF2, so *for*now*
(and it's well documented, supposedly) that redirect is logged as a
separate, but concurrent, access. Maybe our next order of business
should be working on CLF2 (and CGI 1.2, and... :)
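
Until something like CLF2 exists, a log reader can approximate the "requested versus received" pairing by folding the concurrent entry back into the one before it. The following is only a heuristic sketch in Python. The pairing rule (same host, user and timestamp, an error status immediately followed by a 200) and the name fold_internal_redirects are assumptions of this example, not documented server behaviour.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')


def fold_internal_redirects(lines):
    """Merge an error entry and the 200 entry logged right after it."""
    folded = []
    previous = None
    for line in lines:
        m = CLF.match(line)
        if m is None:
            previous = None  # unparseable line: skip it and break any pairing
            continue
        host, _ident, user, when, request, status, _size = m.groups()
        parts = request.split()
        path = parts[1] if len(parts) > 1 else request
        key = (host, user, when)
        if (previous is not None
                and previous["key"] == key
                and previous["status"] >= 400
                and previous["served"] is None
                and int(status) == 200):
            # Treat this entry as the error document for the previous request.
            previous["served"] = path
            continue
        previous = {"key": key, "request": request,
                    "status": int(status), "served": None}
        folded.append(previous)
    return folded
```

On the two entries quoted at the top of this message, this yields a single record with status 401 and "served" set to /401.html. Two unrelated requests from the same user in the same second could be merged by mistake, which is one reason the information really belongs in the log format itself.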

Brian
