Mailing List Archive

Extended Log File Errors
I just tried to run Analog 3.1 against an Extended Log File from my IIS4
server, and got an error that the LogFormat was incorrect. Not being too
familiar with the Extended Log Format, and bearing in mind Stephen's
comments about Microsoft's implementation of Extended Format, I thought
I'd do a little experiment.

I copied the 7 line sample log file from the W3 specification for the
Extended Log Format ( http://www.w3.org/TR/WD-logfile.html ) and ran
analog against that, and got the same error (I had to add a few extra
lines of data to get the "Large Number of corrupt lines" message).

Here's what I got from Analog in Debug mode:

F: Detect that it's in W3 extended format
analog: Warning C: Ignoring corrupt format line in logfile
analog: Warning C: (reason: time without date or vice versa)
analog: Warning L: Large number of corrupt lines in logfile ext.log: try
different LOGFORMAT
Current logfile format:
#Fields:<W3 extended format string>\n
#%j\n


The Logfile looks like this, and has the #Date field in the header.
#Version: 1.0
#Date: 12-Jan-1996 00:00:00
#Fields: time cs-method cs-uri
00:34:23 GET /foo/bar.html
12:21:16 GET /foo/bar.html
12:45:52 GET /foo/bar.html
12:57:34 GET /foo/bar.html
00:34:23 GET /foo/bar.html
12:21:16 GET /foo/bar.html
12:45:52 GET /foo/bar.html
12:57:34 GET /foo/bar.html
00:34:23 GET /foo/bar.html
12:21:16 GET /foo/bar.html
12:45:52 GET /foo/bar.html
12:57:34 GET /foo/bar.html

(I note that the W3C document specified a Date format of YYYY-MM-DD, but
provides a sample with a Date of 12-Jan-1996 !)

The documentation suggests that Extended Log Format should work "out of
the box", so is this a problem at my end, or is there a glitch
somewhere?

Aengus
--
Aengus@Lawlor.org (preferred) | Aengus Lawlor
who used to be alawlor@dit.ie | An bhfuil cead agam dul amok?
--------------------------------------------------------------------
This is the analog-help mailing list. To unsubscribe from this
mailing list, send mail to analog-help-request@lists.isite.net
with "unsubscribe analog-help" in the main BODY OF THE MESSAGE.
--------------------------------------------------------------------
Extended Log File Errors [ In reply to ]
On 12/2/98 6:06 PM Aengus Lawlor (Aengus@Lawlor.Org) wrote:

>I just tried to run Analog 3.1 against an Extended Log File from my IIS4
>server, and got an error that the LogFormat was incorrect. Not being too
>familiar with the Extended Log Format, and bearing in mind Stephen's
>comments about Microsoft's implementation of Extended Format, I thought
>I'd do a little experiment.
>
>I copied the 7 line sample log file from the W3 specification for the
>Extended Log Format ( http://www.w3.org/TR/WD-logfile.html ) and ran
>analog against that, and got the same error (I had to add a few extra
>lines of data to get the "Large Number of corrupt lines" message).
>
>Here's what I got from Analog in Debug mode:
>
>F: Detect that it's in W3 extended format
>analog: Warning C: Ignoring corrupt format line in logfile
>analog: Warning C: (reason: time without date or vice versa)
>analog: Warning L: Large number of corrupt lines in logfile ext.log: try
> different LOGFORMAT
> Current logfile format:
> #Fields:<W3 extended format string>\n
> #%j\n
>
>
>The Logfile looks like this, and has the #Date field in the header.
>#Version: 1.0
>#Date: 12-Jan-1996 00:00:00
>#Fields: time cs-method cs-uri
>00:34:23 GET /foo/bar.html
>12:21:16 GET /foo/bar.html
>12:45:52 GET /foo/bar.html
>12:57:34 GET /foo/bar.html
>00:34:23 GET /foo/bar.html
>12:21:16 GET /foo/bar.html
>12:45:52 GET /foo/bar.html
>12:57:34 GET /foo/bar.html
>00:34:23 GET /foo/bar.html
>12:21:16 GET /foo/bar.html
>12:45:52 GET /foo/bar.html
>12:57:34 GET /foo/bar.html
>
>(I note that the W3C document specified a Date format of YYYY-MM-DD, but
>provides a sample with a Date of 12-Jan-1996 !)
>
>The documentation suggests that Extended Log Format should work "out of
>the box", so is this a problem at my end, or is there a glitch
>somewhere?

The W3C specification is not at all specific about how to write the #Date
line. There are at least three different date formats used in that one
document, and wide disagreement between different implementations.
Microsoft always uses YYYY-MM-DD, all in digits. WebSTAR uses #Start-Date
instead of #Date, with MM/DD/YY in digits in the US but DD/MM/YY in export
versions. Others use DD-MMM-YYYY with the month as abbreviated text.
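
As an illustration of coping with these variants, here is a minimal Python
sketch that tries each layout in turn. The function name parse_header_date,
the day_first flag and the list of formats are assumptions made for this
example; they are taken from the variants described above, not from any
server's documentation. Slash-separated dates cannot be told apart by
inspection, so the caller has to say which convention applies.

```python
from datetime import datetime

# Layouts that can be recognised without extra knowledge (assumed list).
# Note: %b matches English month abbreviations only in an English/C locale.
UNAMBIGUOUS_FORMATS = [
    "%Y-%m-%d %H:%M:%S",  # 1996-01-12 00:00:00, all digits
    "%Y-%m-%d",           # 1996-01-12
    "%d-%b-%Y %H:%M:%S",  # 12-Jan-1996 00:00:00, abbreviated month
    "%d-%b-%Y",           # 12-Jan-1996
]


def parse_header_date(value, day_first=None):
    """Parse the value of a #Date or #Start-Date header line.

    day_first must be True or False for slash-separated dates, because
    MM/DD/YY and DD/MM/YY look identical.
    """
    value = value.strip()
    for fmt in UNAMBIGUOUS_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    if "/" in value:
        if day_first is None:
            raise ValueError("ambiguous slash date; pass day_first: %r" % value)
        fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
        return datetime.strptime(value, fmt)
    raise ValueError("unrecognised date: %r" % value)


if __name__ == "__main__":
    print(parse_header_date("12-Jan-1996 00:00:00"))
    print(parse_header_date("12/02/98", day_first=False))
```

Refusing to guess on slash dates is deliberate: a silent wrong guess would
shift every entry in the file to the wrong day.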

I would need to see an example from your log file to know specifically
what is going wrong with Analog.

Jason

-----------------
Jason@Summary.Net
-----------------
Dr. Seuss books . . . can be read and enjoyed on several levels. For
example, 'One Fish Two Fish, Red Fish Blue Fish' can be deconstructed
as a searing indictment of the narrow-minded binary counting system.
-- Peter van der Linden, Expert C Programming, Deep C Secrets


Extended Log File Errors [ In reply to ]
The problem here is that the date doesn't appear on every line. Analog
requires the date and time (or neither) on every line. If you have one but
not the other it should give you a helpful warning message -- doesn't it?

I know the date's in the header, but the problem is, what if the date
changes during the logfile? A well-behaved server would probably write
the new date, but the extended format draft standard doesn't say anything
about this. One could imagine a server assuming that you can deduce the
new date when the hour flips from 23 to 00.

I have a lot of trouble with extended format. The problem is that the spec
was never finished. There's only an early draft which is much too short.

--
Stephen Turner sret1@cam.ac.uk http://www.statslab.cam.ac.uk/~sret1/
Normally: Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
Until 12/98: Dept of Math & Stats, 585 King Edward Ave, Ottawa K1N 6N5, Canada
Microsoft: Where am I allowed to go today?

Extended Log File Errors [ In reply to ]
One of the advantages of the Extended Log format is that it can be
smaller, because it doesn't include redundant information. (The same 10
bytes of Date information occur on every single line. It's only 10 bytes
per line, but for compact logs, that might represent 15%-30% of the log
size). It seems to me that it would be reasonable for Analog to assume
that if I choose to leave out the date field, I did it for a
reason. While it would be nice to have Analog recognize Date changes in
midstream by recognizing new #Date fields, it would also be perfectly
reasonable to say that you will only handle one date per file, and
require a separate logfile for each date.
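
To put a rough number on the size argument, here is a small Python sketch
using a data line from the sample log earlier in the thread. The function
name date_overhead is made up for this example, and it assumes one
separating space after the date and a one-byte newline per line.

```python
def date_overhead(line_without_date, date_field="1996-01-12"):
    """Fraction of a dated log line taken up by the date field.

    Assumes a single space after the date and a one-byte newline.
    """
    extra = len(date_field) + 1                     # date plus its separator
    with_date = len(line_without_date) + 1 + extra  # +1 for the newline
    return extra / with_date


if __name__ == "__main__":
    share = date_overhead("00:34:23 GET /foo/bar.html")
    print("date accounts for %.0f%% of each line" % (share * 100))
```

For that 26-character sample line the date and its separator come to 11 of
38 bytes, a little under 30%. Lines carrying more fields would put the
share lower.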

I have 30 Extended Daily Log Files for the month of November. Can I
easily analyze them all at once by specifying a custom LOGFORMAT, or
will I have to create a config file specifying a start date for each
daily log file?

Aengus
--
Aengus@Lawlor.org (preferred) | Aengus Lawlor
who used to be alawlor@dit.ie | An bhfuil cead agam dul amok?


______________________________ Reply Separator _________________________________
Subject: Re: [analog-help] Extended Log File Errors
Author: analog-help@lists.isite.net at Internet
Date: 12/3/98 10:00 AM


The problem here is that the date doesn't appear on every line. Analog
requires the date and time (or neither) on every line. If you have one but
not the other it should give you a helpful warning message -- doesn't it?

I know the date's in the header, but the problem is, what if the date
changes during the logfile? A well-behaved server would probably write
the new date, but the extended format draft standard doesn't say anything
about this. One could imagine a server assuming that you can deduce the
new date when the hour flips from 23 to 00.

I have a lot of trouble with extended format. The problem is that the spec
was never finished. There's only an early draft which is much too short.

--
Stephen Turner sret1@cam.ac.uk http://www.statslab.cam.ac.uk/~sret1/
Normally: Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
Until 12/98: Dept of Math & Stats, 585 King Edward Ave, Ottawa K1N 6N5, Canada
Microsoft: Where am I allowed to go today?

Extended Log File Errors [ In reply to ]
On Thu, 3 Dec 1998, Aengus Lawlor wrote:

>
> One of the advantages of the Extended Log format is that it can be
> smaller, because it doesn't include redundant information. (The same 10
> bytes of Date information occur on every single line. It's only 10 bytes
> per line, but for compact logs, that might represent 15%-30% of the log
> size). It seems to me that it would be reasonable for Analog to assume
> that if I choose to leave out the date field, that I did it for a
> reason. While it would be nice to have Analog recognize Date changes in
> mid stream by recognizing new #Date fields, it would also be perfectly
> reasonable to say that you will only handle one date per file, and
> require a separate logfile for each date.
>

Well, I agree, and I probably will end up doing something like this. I'm
still not sure what to do if the date flips over from 23 to 00 without a
new #Date line though. Or from 23 back to 22?
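
One possible heuristic for those two cases, sketched in Python. This is not
how Analog behaves; the function name assign_dates and the 12-hour
threshold are assumptions chosen for this example. A large jump backwards
in clock time is read as midnight passing, while a small one is read as
out-of-order entries on the same day.

```python
from datetime import date, datetime, timedelta


def assign_dates(start, times, threshold=timedelta(hours=12)):
    """Yield a full datetime for each HH:MM:SS string in times.

    start is the date from the header. When the clock jumps backwards by
    more than threshold, the date is advanced by one day; smaller backward
    steps (e.g. 23:10 then 22:50) stay on the same day.
    """
    current = start
    previous = None
    for text in times:
        clock = datetime.strptime(text, "%H:%M:%S").time()
        stamp = datetime.combine(current, clock)
        if previous is not None and previous - stamp > threshold:
            current += timedelta(days=1)
            stamp = datetime.combine(current, clock)
        previous = stamp
        yield stamp


if __name__ == "__main__":
    for stamp in assign_dates(date(1998, 12, 2), ["23:59:58", "00:00:03"]):
        print(stamp)
```

The threshold is a guess and can be fooled. A quiet server with a gap of
more than a day between entries would be dated wrongly, which is one
argument for requiring an explicit #Date line instead.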

> I have 30 Extended Daily Log Files for the month of November. Can I
> easily analyze them all at once by specifying a custom LOGFORMAT, or
> will I have to create a config file specifying a start date for each
> daily log file?
>

You can define a LOGFORMAT in which the time fields are %j'unked, but you
will lose the date/time information. Alternatively you can preprocess the
logfile before passing it to analog. Neither solution is ideal.
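
A minimal Python sketch of the preprocessing route, for illustration only.
It copies the date from each #Date header onto the data lines that follow
and adds "date" to the #Fields line, so that every line carries both date
and time. The names header_date and add_date_field are invented for this
example. It assumes the header date is either YYYY-MM-DD or DD-MMM-YYYY,
and that no quoted field containing spaces comes before the time field.
Whether Analog 3.1 accepts the result has not been tested here.

```python
import sys
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%d-%b-%Y")  # assumed header date layouts


def header_date(value):
    """Return the date part of a #Date value, normalised to YYYY-MM-DD."""
    token = value.split()[0]
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(token, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError("unrecognised #Date value: %r" % value)


def add_date_field(lines):
    """Yield log lines with a date field inserted before the time field."""
    current = None      # date taken from the most recent #Date header
    needs_date = False  # True when #Fields has time but no date
    time_index = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Date:"):
            current = header_date(line[len("#Date:"):])
            yield line
        elif line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            needs_date = "time" in fields and "date" not in fields
            if needs_date:
                time_index = fields.index("time")
                fields.insert(time_index, "date")
            yield "#Fields: " + " ".join(fields)
        elif line.startswith("#") or not line.strip() or not needs_date:
            yield line
        else:
            if current is None:
                raise ValueError("data line before any #Date header")
            # Split off only the fields before time; keep the rest intact.
            parts = line.split(None, time_index)
            parts.insert(time_index, current)
            yield " ".join(parts)


if __name__ == "__main__":
    for out in add_date_field(sys.stdin):
        print(out)
```

Because the date is re-read at every #Date header, a month of daily files
can be passed through this one at a time and the output concatenated, as
long as each file starts with its own header.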

--
Stephen Turner sret1@cam.ac.uk http://www.statslab.cam.ac.uk/~sret1/
Normally: Statistical Laboratory, 16 Mill Lane, Cambridge CB2 1SB, England
Until 12/98: Dept of Math & Stats, 585 King Edward Ave, Ottawa K1N 6N5, Canada
Microsoft: Where am I allowed to go today?


Extended Log File Errors [ In reply to ]
On 12/2/98 11:01 PM Jason Linhart <jason@summary.net> wrote:

>On 12/2/98 6:06 PM Aengus Lawlor (Aengus@Lawlor.Org) wrote:



>The W3C specification is not at all specific about how to write the #Date
>line.

You are correct. The document is quite specific about the format of log
file entries (YYYY-MM-DD) but it doesn't say anything specific about the
format to be used in the Directives fields. But as Analog is perfectly
happy with either format, that's not really an issue; I was just being
pedantic.

>I would need to see an example from your log file to know specifically
>what is going wrong with Analog.

I included the whole sample log file (all 15 lines) that Analog
complained about. See above. It's simply the sample Logfile included in
the W3 Working Draft document, with the data lines duplicated so that
Analog triggers an additional warning.

>Jason

>-----------------
>Jason@Summary.Net
>-----------------

Aengus
--
Aengus@Lawlor.org (preferred) | Aengus Lawlor
who used to be alawlor@dit.ie | An bhfuil cead agam dul amok?
Extended Log File Errors [ In reply to ]
On 12/3/98 12:30 PM Aengus Lawlor (Aengus@Lawlor.Org) wrote:

>>The W3C specification is not at all specific about how to write the #Date
>>line.
>
>You are correct. The document is quite specific about the format of log
>file entries (YYYY-MM-DD) but it doesn't say anything specific about the
>format to be used in the Directives fields. But as Analog is perfectly
>happy with either format, that's not really an issue, I was just being
>pedantic.
>
>>I would need to see an example from your log file to know specifically
>>what is going wrong with Analog.
>
>I included the whole sample log file (all 15 lines) that Analog
>complained about. See above. It's simply the sample Logfile included in
>the W3 Working Draft document, with the data lines duplicated so that
>Analog triggers an additional warning.

What I was trying to say is that the spec is inconsistent and incomplete.
The example given in the spec is invalid according to the best guess one
can make from other parts of the spec, and is completely inconsistent with
the ExLF formats used by any current server. There is no reason the example
should work, and it doesn't.

You said that you had a log file from IIS that didn't work

>I just tried to run Analog 3.1 against an Extended Log File from my IIS4
>server

which led you to read the spec. I was asking what that log file has in
it.

I think Stephen Turner is being too strict when he says:

On 12/3/98 10:00 AM Stephen Turner (sret1@mathstat.uottawa.ca) wrote:

>The problem here is that the date doesn't appear on every line. Analog
>requires the date and time (or neither) on every line. If you have one but
>not the other it should give you a helpful warning message -- doesn't it?

Summary <http://summary.net/summary> will read Microsoft IIS logs with
the date only in the header, and it shouldn't be too difficult to get
Analog to read them as well, although it will require code changes and
not just a configuration file. I wasn't sure, from your original message,
if that was the problem you were having.

Jason

-----------------
Jason@Summary.Net
-----------------
Dr. Seuss books . . . can be read and enjoyed on several levels. For
example, 'One Fish Two Fish, Red Fish Blue Fish' can be deconstructed
as a searing indictment of the narrow-minded binary counting system.
-- Peter van der Linden, Expert C Programming, Deep C Secrets

