Mailing List Archive

LogFormat help request
I have a relatively complex Apache log format that I'm trying to get
analyzed with Analog. I have been unable to get the LogFormat
directive correct, though, so I'm hoping to receive a bit of guidance.

First off, I'm using Analog 6.0 and Apache 2.2.4.

The LogFormat line from httpd.conf:

LogFormat "[%{%Y-%m-%d %H:%M:%S %Z}t] %v:%p %a:%{REMOTE_PORT}e %H %m %Dms %s %>s %X %b %P \"%r\" \"%f\" \"%U\" \"%q\" \"%{Referer}i\" \"%{User-Agent}i\" \"%{SSL_PROTOCOL}e\" \"%{SSL_CIPHER}e\"" aggregate_log


The DEFAULTLOGFORMAT line that I'm trying to use:

DEFAULTLOGFORMAT ( [%Y-%m-%d %h:%n:%j] %v:%j %s:%j %j %j %Tms %c %j %j %b %j "%j" "%r" "%j" "%q" "%f" "%B" "%j" "%j" )

(From my understanding of the docs, I cannot use APACHELOGFORMAT
because I'm using the %{strftime}t time formatting above to get the
months as digits instead of Apache's default three-letter English
abbreviation.)
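
To illustrate the difference between the two timestamp styles, here is a
minimal Python sketch. It assumes Python's strftime codes behave like the
ones used in the %{...}t directive above (both follow the C library
conventions); the variable names are made up for this example, and the time
zone field is left out because a naive datetime has none.

```python
from datetime import datetime

# One of the request times from the sample log below.
stamp = datetime(2008, 2, 11, 10, 50, 2)

# Numeric month, as produced by the custom %{%Y-%m-%d %H:%M:%S ...}t format.
numeric = stamp.strftime("[%Y-%m-%d %H:%M:%S]")

# Abbreviated month name, in the style of Apache's default timestamp.
# The %b output depends on the active locale, so no expected value is shown.
abbreviated = stamp.strftime("[%d/%b/%Y:%H:%M:%S]")

print(numeric)      # [2008-02-11 10:50:02]
print(abbreviated)
```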

This is a sample of the log:

[2008-02-11 10:50:02 EST] library.dartmouth.edu:80 130.189.217.32:- HTTP/1.1 GET 28505ms 200 200 + 270629 11428 "GET /search/search360/search360.js HTTP/1.1" "/data/websites/diglib/search/search360/search360.js" "/search/search360/search360.js" "" "http://www.dartmouth.edu/~biomed/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)" "-" "-"
[2008-02-11 10:50:03 EST] library.dartmouth.edu:80 89.62.40.234:- HTTP/1.1 GET 5239ms 200 200 + 10872 11429 "GET /images/banner_purple.jpg HTTP/1.1" "/data/websites/diglib/images/banner_purple.jpg" "/images/banner_purple.jpg" "" "http://images.google.de/images?q=purple+banner&ie=UTF-8&oe=utf-8&rls=org.mozilla:en-US:official&client=firefox-a&um=1&sa=N&tab=wi" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12" "-" "-"
[2008-02-11 10:50:04 EST] journals.dartmouth.edu:80 195.113.214.196:- HTTP/1.0 GET 13669ms 200 200 - 6476 11432 "GET /latinox/interact/index.html HTTP/1.0" "/data/websites/journals/latinox/interact/index.html" "/latinox/interact/index.html" "" "-" "Jyxobot/1" "-" "-"
[2008-02-11 10:50:05 EST] linguistic-discovery.dartmouth.edu:80 189.131.111.254:- HTTP/1.1 GET 1375ms 302 302 + 409 11431 "GET / HTTP/1.1" "-" "/" "" "http://www.doaj.org/doaj?func=subject&cpid=122" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506; InfoPath.1)" "-" "-"
[2008-02-11 10:50:04 EST] library.dartmouth.edu:80 129.170.117.103:65362 HTTP/1.1 GET 454355ms 200 200 + 8405 11430 "GET / HTTP/1.1" "/data/websites/diglib/index.php" "/index.php" "" "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9b2) Gecko/2007121014 Firefox/3.0b2" "-" "-"

When I run Analog, with DEBUG ON, I get:

/usr/local/analog-6.0/analog: analog version 6.0/Unix
F: Closing configuration file /data/production/analog-giza/analog-combined.cfg
F: Opening /usr/local/analog-6.0/lang/uk.lng as language file
F: Closing language file /usr/local/analog-6.0/lang/uk.lng
F: Opening /usr/local/analog-6.0/lang/ukdom.tab as domains file
F: Closing domains file /usr/local/analog-6.0/lang/ukdom.tab
F: Opening /usr/local/analog-6.0/lang/ukdesc.txt as report descriptions file
F: Closing report descriptions file /usr/local/analog-6.0/lang/ukdesc.txt
F: Opening /dltg/analog-giza/dnscacche as DNS input file
F: Closing DNS input file /dltg/analog-giza/dnscacche
F: Creating /usr/local/analog-6.0/dnslock as DNS lock file
F: Opening /dltg/analog-giza/dnscacche as DNS output file
F: Opening access_200802.log as logfile
C: [2008-02-11 10:50:02 EST] library.dartmouth.edu:80 130.189.217.32:- HTTP/1.1 GET 28505ms 200 200 + 270629 11428 "GET /search/search360/search360.js HTTP/1.1" "/data/websites/diglib/search/search360/search360.js" "/search/search360/search360.js" "" "http://www.dartmouth.edu/~biomed/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)" "-" "-"
C: *
C: [2008-02-11 10:50:03 EST] library.dartmouth.edu:80 89.62.40.234:- HTTP/1.1 GET 5239ms 200 200 + 10872 11429 "GET /images/banner_purple.jpg HTTP/1.1" "/data/websites/diglib/images/banner_purple.jpg" "/images/banner_purple.jpg" "" "http://images.google.de/images?q=purple+banner&ie=UTF-8&oe=utf-8&rls=org.mozilla:en-US:official&client=firefox-a&um=1&sa=N&tab=wi" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12" "-" "-"
C: *
C: [2008-02-11 10:50:04 EST] journals.dartmouth.edu:80 195.113.214.196:- HTTP/1.0 GET 13669ms 200 200 - 6476 11432 "GET /latinox/interact/index.html HTTP/1.0" "/data/websites/journals/latinox/interact/index.html" "/latinox/interact/index.html" "" "-" "Jyxobot/1" "-" "-"
C: *
C: [2008-02-11 10:50:05 EST] linguistic-discovery.dartmouth.edu:80 189.131.111.254:- HTTP/1.1 GET 1375ms 302 302 + 409 11431 "GET / HTTP/1.1" "-" "/" "" "http://www.doaj.org/doaj?func=subject&cpid=122" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506; InfoPath.1)" "-" "-"
C: *
C: [2008-02-11 10:50:04 EST] library.dartmouth.edu:80 129.170.117.103:65362 HTTP/1.1 GET 454355ms 200 200 + 8405 11430 "GET / HTTP/1.1" "/data/websites/diglib/index.php" "/index.php" "" "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9b2) Gecko/2007121014 Firefox/3.0b2" "-" "-"
C: *
F: Closing logfile access_200802.log
S: Successful requests: 0
S: Redirected requests: 0
S: Failed requests: 0
S: Requests returning informational status code: 0
S: Status code not given: 0
S: Unwanted lines: 0
S: Corrupt lines: 5
F: Closing DNS output file /dltg/analog-giza/dnscacche
F: Deleting DNS lock file /usr/local/analog-6.0/dnslock
F: Opening /data/websites/giza/hoyle/analog-test/index.html as output file
F: Closing /data/websites/giza/hoyle/analog-test/index.html


After trying to simplify the log string as much as possible, it seems
that my error is in the date/time section above. As far as I can
tell, the way to encode [2008-02-11 10:50:04 EST] so that Analog will
parse it correctly is [%Y-%m-%d %h:%n:%j], yet this doesn't seem to
work.

Any help that you can provide will be much appreciated.

Thanks,

Roberto Hoyle


+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------
Re: LogFormat help request
Roberto Hoyle <Roberto.J.Hoyle@dartmouth.edu> wrote:
>
> DEFAULTLOGFORMAT ( [%Y-%m-%d %h:%n:%j] %v:%j %s:%j %j %j %Tms %c %j %j
> %b %j "%j" "%r" "%j" "%q" "%f" "%B" "%j" "%j" )
>
>
> This is a sample of the log:
>
> [2008-02-11 10:50:02 EST] library.dartmouth.edu:80 130.189.217.32:-
> HTTP/1.1 GET 28505ms 200 200 + 270629 11428 "GET /se
> arch/search360/search360.js HTTP/1.1" "/data/websites/diglib/search/
> search360/search360.js" "/search/search360/search360
> .js" "" "http://www.dartmouth.edu/~biomed/" "Mozilla/4.0 (compatible;
> MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322;
> .NET CLR 2.0.50727)" "-" "-"

Your DEFAULTLOGFORMAT starts with a space (everything within the parentheses is significant), but your logfile lines don't.

If you take out the spaces at the beginning and end of the LOGFORMAT, Analog will interpret the sample lines you provided.

LOGFORMAT ([%Y-%m-%d %h:%n:%j] %v:%j %s:%j %j %j %Tms %c %j %j %b %j "%j" "%r" "%j" "%q" "%f" "%B" "%j" "%j")
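
As a rough illustration of why one stray space makes every line fail, here
is a small Python sketch of a literal-sensitive matcher. It is a toy
stand-in, not Analog's parser: the function name is made up, the sample line
is shortened, and treating %j as "skip anything up to the next literal" is
an assumption made only for this example.

```python
import re

def toy_format_to_regex(fmt: str) -> re.Pattern:
    """Turn a format string into an anchored regular expression.

    Assumption for this sketch: '%j' is a wildcard, and every other
    character (spaces included) must appear literally in the log line.
    """
    literal_parts = fmt.split("%j")
    body = ".*?".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + body + "$")

# Shortened, hypothetical log line in the same shape as the samples.
line = '[2008-02-11 10:50:02 EST] library.dartmouth.edu:80 "-" "-"'

# Same format twice: once with the spaces just inside the parentheses
# carried over, once without them.
with_spaces = toy_format_to_regex(' [%j] %j:%j "%j" "%j" ')
without_spaces = toy_format_to_regex('[%j] %j:%j "%j" "%j"')

print(bool(with_spaces.match(line)))     # False: the line does not start with a space
print(bool(without_spaces.match(line)))  # True
```

Because the pattern is anchored at both ends, the leading space alone is
enough to reject every line, which is consistent with the "Corrupt lines: 5"
count in the debug output.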

Aengus

Re: LogFormat help request
On Feb 11, 2008, at 4:01 PM, Aengus wrote:
> Your DEFAULTLOGFORMAT starts with a space (everything within the
> parentheses is important), your logfile doesn't.
>
> If you take the spaces out at the beginning and end of the LOGFORMAT,
> Analog will interpret the sample lines you provided.
>
> LOGFORMAT ([%Y-%m-%d %h:%n:%j] %v:%j %s:%j %j %j %Tms %c %j %j %b %j
> "%j" "%r" "%j" "%q" "%f" "%B" "%j" "%j")

Thank you, that was exactly the problem.

After staring at something for too long you stop seeing the obvious...

r.