Mailing List Archive

1and1 logfile format
Has anyone any experience of processing 1and1 logfiles through analog? They
seem to use an unusual format that analog doesn't recognise. The record
layout seems to be:

Requestor IP
Username
Date/time
"GET requested URL inc HTTP version"
HTTP response
Bytes
Domain to which the request refers <-- This seems to be something particular
to 1and1 and is unquoted
"Referrer"
"User agent"
"-" <-- No idea what this is meant to be!

So for example:

124.115.0.145 - - [07/Sep/2009:00:24:08 +0200] "GET / HTTP/1.1" 200 17616
www.mydomain.com "http://www.mydomain.com/" "Mozilla/4.0 (compatible; MSIE
6.0; Windows NT 5.1)" "-"
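
As a quick check of the layout listed above, here is a minimal Python sketch that splits the sample record into named fields. The group names (`ident`, `vhost`, `extra` and so on) are illustrative assumptions, not Analog or 1and1 terminology, and the pattern is fitted only to the single sample line shown here.

```python
import re

# Field order follows the record layout described in this post.
# Group names are hypothetical labels chosen for readability.
LINE_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'(?P<vhost>\S+) '          # unquoted domain, the 1and1-specific field
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<agent>[^"]*)" '
    r'"(?P<extra>[^"]*)"'       # trailing "-" of unknown meaning
)

# The sample record from the post, joined back onto one line.
SAMPLE = (
    '124.115.0.145 - - [07/Sep/2009:00:24:08 +0200] "GET / HTTP/1.1" 200 17616 '
    'www.mydomain.com "http://www.mydomain.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not fit the layout."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


fields = parse_line(SAMPLE)
```

Note that the referrer, user agent and trailing field are all wrapped in double quotes in the raw record, while the domain is not.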

I'm presuming a LOGFORMAT would fix it but no luck. I tried:

LOGFORMAT (%s - %u [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b %j %f %B %j)

Which when matched up with the log record would read as:

%s - Requestor IP
-
%u - Username
[%d/%M/%Y:%h:%n:%j %j] - Date/time
%j - GET
%r - requested URL
%j - HTTP version
%c - HTTP response
%b - Bytes
%j - Domain to which the request refers
%f - Referrer
%B - Browser/user agent
%j - "-"

Strangely, despite the LOGFORMAT command, <errors.txt> reports:

analog: Warning F: Can't auto-detect format of logfile
Logs\20090907.log: ignoring it

Any ideas?

Many thanks.../Iain
RE: 1and1 logfile format [ In reply to ]
Well I've cured one issue but found another...

By placing my LOGFORMAT ahead of the LOGFILE directive I've produced a
report, but the issues I have are:

1. This odd multi-host combined logfile, which seems to hold multiple domains
owned by a single 1and1 account in the one file, and

2. The search engine queries are lost and <errors.txt> reports:

analog: Warning R: Turning off empty Search Query Report
analog: Warning R: Turning off empty Search Word Report

But...I can find the various Google and other referrals with queries so
don't yet understand why the search word reports are empty and so switched
off.

So if anyone has experience of 1and1 logfile format I'd be very grateful :-)

Many thanks.../Iain


Re: 1and1 logfile format [ In reply to ]
Don't use LOGFORMAT; use APACHELOGFORMAT instead:

APACHELOGFORMAT (%h %l %u %t \"%r\" %s %b %{Host}i \"%{Referer}i\" \"%{User-agent}i\" "%j")

This works just fine with 1&1 here, with the line as the first one in the
CFG file. (Keep it all on one line.)

Dave



http://www.davesergeant.com

+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------
Re: RE: 1and1 logfile format [ In reply to ]
LOGFORMAT (or APACHELOGFORMAT) must precede LOGFILE. It defines the log file
format used for subsequent LOGFILE commands. See
http://analog.cx/docs/logfmt.html.

You can use %v in Analog's LOGFORMAT command to define the virtual host that
the log line refers to. Then you can use the Virtual Host report to view
cross-host comparisons or VHOSTEXCLUDE to constrain a report to a single
virtual host.

The errors.txt file is meant for human consumption and provides no more
detail (from an analytics standpoint) than is already listed in the
access.log. You can skip the file completely. See
http://analog.cx/docs/faq.html#faq173

--
Jeremy Wadsack


Re: RE: 1and1 logfile format [ In reply to ]
Iain Hunneybell <iain@ipmarketing.co.uk> wrote:
>> Well I've cured one issue but found another...
>>
>> By placing my LOGFORMAT ahead of the LOGFILE directive I've produced
>> a report, but the issues I have are:
>>
>> 1. This odd multi-host combined logfile with it would seem multiple
>> domains owned by a single 1and1 account in the one logfile, and

As Jeremy pointed out, you can use the %v field to attribute each line to a particular virtual host, and then use the VHOSTINCLUDE command to generate a report for just a single virtual host. (If you have 10 virtual hosts, you'll have to run Analog 10 times to create a separate report for each one, though.)
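
To see which virtual hosts a combined logfile actually contains before setting up one run per host, something like the following Python sketch could be used. It is not part of Analog; the function name `count_vhosts` and the second hostname in the usage lines are made up for illustration, and it assumes the 1and1 layout from the first post with no embedded double quotes in the request.

```python
from collections import Counter


def count_vhosts(lines):
    """Count log lines per virtual host.

    Assumes the host is the third whitespace-separated token after the
    closing quote of the request (status, bytes, then host).
    """
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # not a record in the expected layout
        tail = parts[2].split()  # expected: [status, bytes, vhost]
        if len(tail) == 3:
            counts[tail[2]] += 1
    return counts


# Hypothetical records in the same layout as the sample in the first post.
demo = [
    '1.2.3.4 - - [07/Sep/2009:00:24:08 +0200] "GET / HTTP/1.1" 200 17616 '
    'www.mydomain.com "-" "Mozilla/4.0" "-"',
    '1.2.3.5 - - [07/Sep/2009:00:25:00 +0200] "GET /a HTTP/1.1" 200 512 '
    'www.other.example "-" "Mozilla/4.0" "-"',
    '1.2.3.6 - - [07/Sep/2009:00:26:00 +0200] "GET /b HTTP/1.1" 404 0 '
    'www.mydomain.com "-" "Mozilla/4.0" "-"',
]
vhost_counts = count_vhosts(demo)
```

Each key of the resulting counter is a candidate value for a per-host run.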

>> 2. The search engine queries are lost and <errors.txt> reports:
>>
>> analog: Warning R: Turning off empty Search Query Report
>> analog: Warning R: Turning off empty Search Word Report
>>
>> But...I can find the various Google and other referrals with queries
>> so don't yet understand why the search word reports are empty and so
>> switched off.

Your sample line has quotes around the Referrer string, but your LOGFORMAT doesn't, so the " becomes part of the referrer string, and none of the SEARCHENGINE entries start with ". You have the same problem with the Browser field.

Try this LOGFORMAT instead:
LOGFORMAT (%s - %u [%d/%M/%Y:%h:%n:%j %j] "%j %r %j" %c %b %v "%f" "%B" %j)
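
The effect of the stray quote can be illustrated outside Analog with a small Python sketch. The pattern below is a hypothetical one written in the general style of a SEARCHENGINE entry, the referrer URL is made up, and `fnmatchcase` only approximates Analog's wildcard matching; the point is simply that a string beginning with " cannot match a pattern beginning with http://.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard pattern; not copied from any Analog configuration.
PATTERN = "http://*google.*/*"

# A made-up referrer field as it would appear in the raw log, quotes included.
raw_field = '"http://www.google.com/search?q=analog+logformat"'

# With a bare %f the quotes end up inside the captured referrer.
bare_capture = raw_field

# With "%f" the quotes are matched as literals, leaving only the URL.
quoted_capture = raw_field.strip('"')

matches_bare = fnmatchcase(bare_capture, PATTERN)
matches_quoted = fnmatchcase(quoted_capture, PATTERN)
```

Here `matches_bare` is false and `matches_quoted` is true, which is consistent with the search reports coming up empty under the original LOGFORMAT.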

Aengus
