Mailing List Archive

multiple domains one log file
I have one access log that contains information for multiple virtual
domains. Can someone help me out with the parameter that will allow me
to pull out only the information for one (or two) domains?



Thanks,


Steve
-----------------------------------
Steve Hildreth
Office: 213-241-1691
Cell: 213-215-8195
steve.hildreth@lausd.net
Re: multiple domains one log file
Hildreth, Steve <steve.hildreth@lausd.net> wrote:
> I have one access log that contains information for multiple virtual
> domains. Can someone help me out with the parameter that will allow
> me to pull out only the information for one (or two domains)?

If each line of the log file includes a field that indicates which "VHOST" (in Analog terminology) the entry belongs to, and your LOGFORMAT has %v in that position, then VHOSTINCLUDE will tell Analog to generate a report only for the entries that match the specified VHOSTs.

For example:

1.2.3.4 - - server1 [02/Mar/2008:00:55:43 -0500] "GET /index.html HTTP/1.1" 200

LOGFORMAT (%S %u %j %v [%d/%M/%Y:%h:%n:%t %j] "%j %r %j" %c)


VHOSTINCLUDE server1
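
For anyone who wants to sanity-check a log outside Analog, the same filtering idea can be sketched in a few lines of Python. This is only an illustration and not part of Analog: it assumes the layout of the example line above (virtual host in the fourth field), and the regular expression, the function name filter_by_vhost, and the second sample line are made up for the sketch.

```python
import re

# Assumed layout, modeled on the example line in this thread:
#   host ident user vhost [timestamp] "request" status
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) (?P<vhost>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3})'
)


def filter_by_vhost(lines, wanted):
    """Yield only the lines whose vhost field is one of `wanted`."""
    wanted = {name.lower() for name in wanted}
    for line in lines:
        match = LINE_RE.match(line)
        # Lines that do not fit the assumed layout are skipped.
        if match and match.group("vhost").lower() in wanted:
            yield line


# Hypothetical sample data: two requests, one per virtual host.
sample = [
    '1.2.3.4 - - server1 [02/Mar/2008:00:55:43 -0500] "GET /index.html HTTP/1.1" 200',
    '1.2.3.5 - - server2 [02/Mar/2008:00:55:44 -0500] "GET /other.html HTTP/1.1" 200',
]
kept = list(filter_by_vhost(sample, ["server1"]))
print(len(kept))
```

If the filter keeps nothing, the log probably has no virtual host field in that position, and VHOSTINCLUDE will have nothing to match either.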

Aengus

+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------
RE: multiple domains one log file
I attempted to implement the suggested parameter and was not successful.
Let me include a couple of lines from the log file to make sure I
explained my question correctly. Also, I am using Analog 5.32.

What I am trying to report on are all log entries that are for the
domain www.lastudentscount.org while excluding all log entries for the
domain www.lausd.net.


10.82.26.193 - - [04/Mar/2008:10:05:27 -0800] "GET /Clinton_MS/work_files/compass.gif HTTP/1.1" 304 - "http://www.lausd.net/Clinton_MS/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"

204.108.65.10 - - [04/Mar/2008:10:10:18 -0800] "GET /images/LAUSDmap.jpg HTTP/1.1" 200 441023 "http://www.lastudentscount.org/aboutlausd.html" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)"

Thanks,


Steve
-----------------------------------
Steve Hildreth
Office: 213-241-1691
Cell: 213-215-8195
steve.hildreth@lausd.net


Re: multiple domains one log file
There is nothing in those logfile lines to distinguish which host each
one is from. The URLs are referrers -- where the visitor followed a
link from -- which are often internal to the site, but you will also
have external referrers, and you may even have links pointing from one
site to the other.

You need to configure your web server either to log each host to a
separate file, or to identify the host on each line. Without one of
those things there is nothing analog or any other program can do to
sort out which requests relate to which host.

--
Stephen Turner
Re: multiple domains one log file
Hildreth, Steve <steve.hildreth@lausd.net> wrote:
> I attempted to implement the parameter below and was not successful.
> Let me include a couple line from a log file to make sure I correctly
> explained what I am asking. Also, I am using analog 5.32.
>
> What I am trying to report on are all log entries that are for the
> domain www.lastudentscount.org while excluding all log entries for the
> domain www.lausd.net.
>
>
> 10.82.26.193 - - [04/Mar/2008:10:05:27 -0800] "GET
> /Clinton_MS/work_files/compass.gif HTTP/1.1" 304 -
> "http://www.lausd.net/Clinton_MS/" "Mozilla/4.0 (compatible; MSIE 6.0;
> Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR
> 2.0.50727)"
>
> 204.108.65.10 - - [04/Mar/2008:10:10:18 -0800] "GET
> /images/LAUSDmap.jpg HTTP/1.1" 200 441023
> "http://www.lastudentscount.org/aboutlausd.html" "Mozilla/4.0
> (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR
> 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)"

Okay, there's nothing in those log entries to indicate that they are from different "virtual domains". The 2 entries in the middle that start with http:// are referrer fields - I'm pretty sure that if you look at your log files closely, you'll find http://www.google.com showing up in that field on a regular basis, and I'm pretty sure you don't have a "virtual domain" called www.google.com :-)

You'll have to modify your web server's log settings to do what you want to do. Right now, it's just logging all requests for both servers into a single logfile, without indicating which entries belong to which server.
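
To see why the referrer field is a poor stand-in for the virtual host, here is a rough Python sketch that pulls the referrer host out of combined-format lines. It is an illustration only: the regular expression and the function name referrer_host are assumptions of the sketch, the user-agent strings are abbreviated, and the third sample line (with a google.com referrer) is invented to make the point.

```python
import re
from urllib.parse import urlsplit

# Combined Log Format, as in the lines quoted in this thread:
#   host ident user [time] "request" status bytes "referrer" "agent"
COMBINED_RE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)


def referrer_host(line):
    """Return the host part of the referrer, or None if there is none."""
    match = COMBINED_RE.match(line)
    if not match:
        return None
    # A referrer of "-" has no host, so hostname is None in that case.
    return urlsplit(match.group(8)).hostname


sample = [
    '10.82.26.193 - - [04/Mar/2008:10:05:27 -0800] "GET /Clinton_MS/work_files/compass.gif HTTP/1.1" 304 - "http://www.lausd.net/Clinton_MS/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '204.108.65.10 - - [04/Mar/2008:10:10:18 -0800] "GET /images/LAUSDmap.jpg HTTP/1.1" 200 441023 "http://www.lastudentscount.org/aboutlausd.html" "Mozilla/4.0 (compatible; MSIE 7.0)"',
    # Invented line: same server, but the visitor arrived from a search engine.
    '192.0.2.7 - - [04/Mar/2008:10:12:00 -0800] "GET /index.html HTTP/1.1" 200 1234 "http://www.google.com/search?q=lausd" "Mozilla/4.0"',
]
hosts = [referrer_host(line) for line in sample]
print(hosts)
```

The third line shows the problem: the referrer names a host the server never served, so grouping by referrer would misfile that request.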

Aengus

RE: multiple domains one log file
Thanks all for the clarification and information. I'll work with the
System Admins to implement the correct logging.


Steve
-----------------------------------
Steve Hildreth
Office: 213-241-1691
Cell: 213-215-8195
steve.hildreth@lausd.net


Re: multiple domains one log file
Salaam!

Hildreth, Steve wrote:

> Thanks all for the clarification and information. I'll work
> with the System Admins to implement the correct logging.

I'm working with a server that does not have any provision for
changing the logging format other than to turn it on or off. It does
not indicate virtual domains.

In addition, most servers support either the "Common Log Format"
or the "Extended Common Log Format" or both. Neither of these has a
provision for "Virtual host." See
http://www.muslimamerica.us/aj/eclf.htm for an explanation of these
log formats.

Here's a possible solution:

Create under each virtual root one folder that identifies the
virtual domain. Make these unique names. Then each GET, POST, or
other request will show as "GET /<identifier>/path/object" and you can
instruct Analog to include only lines with the particular identifier
you want. You'll still have to deal with lines whose referrer is on
domain "a" but whose request is actually for domain "b", but I'm sure
you can work that out.

This will mean that each virtual root will contain only one item,
which is the identifying subfolder under which is everything that was
formerly under the virtual root. All the links, everywhere on the
Web, that point to anything on your sites, will be broken, UNLESS your
server keeps track of such changes and refers the requests to the new
location (i.e., "under the identifier").
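
The identifier-folder idea can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a tested recipe: the identifiers "site-a" and "site-b", the sample lines, and the helper names request_path and belongs_to are all hypothetical. The sketch looks only at the request path and ignores the referrer, which is one way around the referrer confusion mentioned above.

```python
def request_path(line):
    """Return the path from the quoted request field, or None if malformed."""
    parts = line.split('"')
    if len(parts) < 2:
        return None
    request = parts[1].split()  # e.g. ['GET', '/site-a/index.html', 'HTTP/1.1']
    if len(request) < 2:
        return None
    return request[1]


def belongs_to(line, identifier):
    """True when the requested path lies under /<identifier>/."""
    path = request_path(line)
    return path is not None and path.startswith("/" + identifier + "/")


# Hypothetical lines: the second is a request for site-b that was
# referred from a page under site-a.
a_line = ('192.0.2.1 - - [04/Mar/2008:10:00:00 -0800] '
          '"GET /site-a/index.html HTTP/1.1" 200 512 "-" "Mozilla/4.0"')
b_line = ('192.0.2.2 - - [04/Mar/2008:10:00:05 -0800] '
          '"GET /site-b/news.html HTTP/1.1" 200 734 '
          '"http://www.example.org/site-a/index.html" "Mozilla/4.0"')

print(belongs_to(a_line, "site-a"), belongs_to(b_line, "site-a"))
```

Because the check uses the first quoted field only, a referrer that happens to contain the other site's identifier does not cause a false match.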

There is one other possibility, but I don't know how to implement
it. This would involve using "authentication" in an unusual way.

The third field in both the Common Log Format and the ECLF is
"authuser." This is only shown when the client is required to log in
with a username and password. If you can figure out how to
"auto-authenticate" visitors with a script of some sort in the virtual
root of each domain, then this third field can contain the name of the
virtual domain, and you can configure Analog to use it to separate the
logs. (The "username" and "password" auto-entered on the client side
will persist until the client replaces them with others or restarts the
browser.) Usually the visitor is presented with a popup box demanding
a username and password; if you can write the code to fill in those
blanks and answer the challenge transparently, then this will fill in
that third field in the log lines.
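
Assuming the third field could be populated that way, separating the lines afterwards is straightforward. The Python sketch below only shows the log-reading side; it does not solve the auto-authentication problem. The usernames "domain-a" and "domain-b", the sample lines, and the function names are hypothetical, and it assumes usernames contain no spaces.

```python
from collections import Counter


def authuser(line):
    """Return the third field (authuser), or None when it is '-' or missing."""
    fields = line.split(" ", 3)
    if len(fields) < 4:
        return None
    return None if fields[2] == "-" else fields[2]


def count_by_authuser(lines):
    """Count requests per authuser value, skipping unauthenticated lines."""
    counts = Counter()
    for line in lines:
        user = authuser(line)
        if user is not None:
            counts[user] += 1
    return counts


# Hypothetical lines in Common Log Format with the domain name as authuser.
sample = [
    '192.0.2.1 - domain-a [04/Mar/2008:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.2 - domain-b [04/Mar/2008:10:00:05 -0800] "GET /news.html HTTP/1.1" 200 734',
    '192.0.2.3 - domain-a [04/Mar/2008:10:00:09 -0800] "GET /about.html HTTP/1.1" 200 210',
    '192.0.2.4 - - [04/Mar/2008:10:00:12 -0800] "GET /robots.txt HTTP/1.1" 404 0',
]
counts = count_by_authuser(sample)
print(counts)
```

Lines with "-" in the third field are left out of the counts, since they carry no information about which domain they belong to.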

And yes, I would be very interested in obtaining such a script.
Meanwhile, I'll present the problem to my youngest son, who is much
more likely to know how to do it than I am.

> Steve

was-salaam,
abujamal
--
astaghfirullahal-ladhee laa ilaha illa
howal-hayyul-qayyoom wa 'atoobu 'ilaihi

Rejoice, muslims, in martyrdom without fighting,
a Mercy for us. Be like the better son of Adam.
