Mailing List Archive

LOGFORMAT question
Hi analog folks,

Here is my current logformat string and two example log lines:

(%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %v)

24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] "(GET /kermit/postal-ca.html HTTP/1.1)" 200 9449 "(ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA?p=montreal%2C+quebec+postal+codes&fr=yfp-t-501&ei=UTF-8)" "(client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; .NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))"

c01.ba.accelovation.com - - [30/Dec/2007:00:00:10 -0500] "(GET /edit_entry.php?area=63&room=67&hour=16&minute=30&year=2007&month=12&day=06 HTTP/1.0)" 302 2976 "(ref http://meeting.cc.columbia.edu/day.php?year=2007&month=12&day=06&area=63)" "(client Mozilla/5.0 (compatible; heritrix/1.12.0+http://www.accelobot.com))" "vhost meeting.cc.columbia.edu"

Some lines have a virtual host field and some do not. They're all intermixed
in the logfiles.

Here is a link to the report:

http://www.columbia.edu/~jf2412/report/

(it's big and may take a long time to load.)

Things start to go south around the Browser report: it's still reporting
search keywords. Then, if you go to the Virtual Host report, it lists
browsers. The Virtual Host Redirect report has the same problem.

Any suggestions on how to rejigger my LOGFORMAT to get this right?

TIA,

Joshua





--
Joshua S. Freeman
Director- CUIT Interactive Services
o: 212.854.2083 | m: 347.392.2560
Skype/YIM: karmester | Skype-In: 914.613.3132


+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------
Re: LOGFORMAT question
Joshua S. Freeman <jf2412@columbia.edu> wrote:
> Hi analog folks,
>
> Here is my current logformat string and two example log lines:
>
> (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %v)
>
> 24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] "(GET /kermit/postal-ca.html HTTP/1.1)" 200 9449 "(ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA?p=montreal%2C+quebec+postal+codes&fr=yfp-t-501&ei=UTF-8)" "(client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; .NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))"
>
> c01.ba.accelovation.com - - [30/Dec/2007:00:00:10 -0500] "(GET /edit_entry.php?area=63&room=67&hour=16&minute=30&year=2007&month=12&day=06 HTTP/1.0)" 302 2976 "(ref http://meeting.cc.columbia.edu/day.php?year=2007&month=12&day=06&area=63)" "(client Mozilla/5.0 (compatible; heritrix/1.12.0+http://www.accelobot.com))" "vhost meeting.cc.columbia.edu"
>

Your referrer, browser and vhost fields are each preceded by an identifier ("ref", "client", "vhost") and delimited by quotes. Your LOGFORMAT separates the fields by spaces, and there are lots of spaces in your Browser field, so chunks of the browser string spill into the fields that follow and turn up in the wrong reports.
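
To make the effect concrete, here is a minimal Python sketch (not analog itself) that splits the first sample line on whitespace and, for comparison, tokenizes it so that quoted strings and the bracketed timestamp stay whole. The regular expression and the variable names are illustrative assumptions, not anything analog uses internally.

```python
import re

# First sample log line from the original post, joined onto one line.
line = (
    '24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] '
    '"(GET /kermit/postal-ca.html HTTP/1.1)" 200 9449 '
    '"(ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA'
    '?p=montreal%2C+quebec+postal+codes&fr=yfp-t-501&ei=UTF-8)" '
    '"(client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; '
    '.NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))"'
)

# Naive approach: every space starts a new field, so the browser string
# is broken into many small pieces.
space_fields = line.split()

# Quote-aware approach: a double-quoted string or a [bracketed] timestamp
# counts as one field; anything else is a run of non-space characters.
quoted_fields = re.findall(r'"[^"]*"|\[[^\]]*\]|\S+', line)

print(len(space_fields), "fields when splitting on spaces")
print(len(quoted_fields), "fields when quotes delimit fields")
print("browser field:", quoted_fields[-1])
```

The whitespace split produces far more fields than the line really has, which is why fragments of the browser string land in later reports.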

You need two LOGFORMAT strings to deal with the fact that only some of your entries have a VHost entry:

LOGFORMAT (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b "%j %f" "%j %B")
LOGFORMAT (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b "%j %f" "%j %B" "%j %v")

Aengus

Re: LOGFORMAT question
I admit to not having tried it, but I think you need two LOGFORMAT
lines to match the two different formats, and they should look
something like this:

LOGFORMAT '%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b "(ref %f)" "(client %B)" "vhost %v"'
LOGFORMAT '%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b "(ref %f)" "(client %B)"'

They have to go in that order, with the longer vhost format first. I put
the arguments in single quotes because they already contain double
quotes and parentheses.
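
As a rough illustration of why the order could matter, here is a Python sketch that translates the two formats into regular expressions and tries them in sequence. It assumes a matcher that accepts the first pattern matching the start of a line. That assumption is mine, made only to illustrate the point, and is not a description of analog's actual parser. The names COMMON, WITH_VHOST, WITHOUT_VHOST and parse are hypothetical.

```python
import re

# Rough regex analogue of the part both formats share:
# '%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b "(ref %f)" "(client %B)"'
COMMON = (
    r'(?P<host>\S+) - - \[(?P<time>[^\]]+)\] '
    r'"\((?P<method>\S+) (?P<request>\S+) (?P<proto>[^"]+)\)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"\(ref (?P<referrer>[^"]*)\)" '
    r'"\(client (?P<browser>[^"]*)\)"'
)
WITH_VHOST = re.compile(COMMON + r' "vhost (?P<vhost>[^"]*)"')
WITHOUT_VHOST = re.compile(COMMON)

# The two sample lines from the original post, each joined onto one line.
LINE_PLAIN = (
    '24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] '
    '"(GET /kermit/postal-ca.html HTTP/1.1)" 200 9449 '
    '"(ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA'
    '?p=montreal%2C+quebec+postal+codes&fr=yfp-t-501&ei=UTF-8)" '
    '"(client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; '
    '.NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))"'
)
LINE_WITH_VHOST = (
    'c01.ba.accelovation.com - - [30/Dec/2007:00:00:10 -0500] '
    '"(GET /edit_entry.php?area=63&room=67&hour=16&minute=30'
    '&year=2007&month=12&day=06 HTTP/1.0)" 302 2976 '
    '"(ref http://meeting.cc.columbia.edu/day.php'
    '?year=2007&month=12&day=06&area=63)" '
    '"(client Mozilla/5.0 (compatible; '
    'heritrix/1.12.0+http://www.accelobot.com))" '
    '"vhost meeting.cc.columbia.edu"'
)


def parse(line, patterns=(WITH_VHOST, WITHOUT_VHOST)):
    """Return the named fields of the first pattern matching the start of line."""
    for pattern in patterns:
        match = pattern.match(line)  # anchored at the start, not at the end
        if match:
            return match.groupdict()
    return None


# Longer format first: the vhost is captured when present.
print(parse(LINE_WITH_VHOST).get("vhost"))
# Shorter format first: it matches the leading part of the line,
# so the vhost field is never captured.
print(parse(LINE_WITH_VHOST, (WITHOUT_VHOST, WITH_VHOST)).get("vhost"))
# Lines without a vhost fall through to the shorter format.
print(parse(LINE_PLAIN).get("browser"))
```

Under that assumption, putting the shorter pattern first means it also accepts the vhost lines and the virtual host is silently dropped, which is consistent with listing the vhost format first.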

--
Stephen Turner


2008/5/2 Joshua S. Freeman <jf2412@columbia.edu>:
> Hi analog folks,
>
> Here is my current logformat string and two example log lines:
>
> (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %v)
>
> 24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] "(GET /kermit/postal-ca.html HTTP/1.1)" 200 9449 "(ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA?p=montreal%2C+quebec+postal+codes&fr=yfp-t-501&ei=UTF-8)" "(client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; .NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))"
>
> c01.ba.accelovation.com - - [30/Dec/2007:00:00:10 -0500] "(GET /edit_entry.php?area=63&room=67&hour=16&minute=30&year=2007&month=12&day=06 HTTP/1.0)" 302 2976 "(ref http://meeting.cc.columbia.edu/day.php?year=2007&month=12&day=06&area=63)" "(client Mozilla/5.0 (compatible; heritrix/1.12.0+http://www.accelobot.com))" "vhost meeting.cc.columbia.edu"
>