Mailing List Archive

Reporting on elements in the request line
Hi

I've got log files from a Google Search appliance. The data is located in
the "request" value.

192.0.0.0 - - [05/Nov/2009:14:50:49 -0500] "GET
/search?coutput=json&q=red+hat&btnG=Google+Search&access=p&client=default_frontend&sort=date%3AD%3AL%3Ad1&entqr=3&oe=UTF-8&ie=UTF-8&ud=1&site=default_collection&ip=192.0.0.0&num=100&filter=0&output=xml
HTTP/1.1" 200 77014 438 0.12

I'd like to be able to report off of the values such as "&client" and
"&site" and "&q".

Is there any way to do that?

For example, I'd love to be able to have a report that would show how many
requests had "&site=default_collection" and then see the breakdown for any
other values for &site=.

Thanks
Terry
Re: Reporting on elements in the request line [ In reply to ]
The Internal Search Word Report will give you a breakdown by phrase for
one or more base URLs. See http://analog.cx/docs/args.html#SEARCHENGINE for
details on the command.

You can also use the ARGSINCLUDE command
(http://analog.cx/docs/args.html#ARGSINCLUDE) if you just want to look at
certain arguments on a given URL. This breaks down the arguments for the URL
in the Request Report.

--
Jeremy Wadsack


On Tue, Nov 10, 2009 at 6:46 AM, Terry Chambers <terry.chambers@gmail.com> wrote:

> Hi
>
> I've got log files from a Google Search appliance. The data is located in
> the "request" value.
>
> 192.0.0.0 - - [05/Nov/2009:14:50:49 -0500] "GET
> /search?coutput=json&q=red+hat&btnG=Google+Search&access=p&client=default_frontend&sort=date%3AD%3AL%3Ad1&entqr=3&oe=UTF-8&ie=UTF-8&ud=1&site=default_collection&ip=192.0.0.0&num=100&filter=0&output=xml
> HTTP/1.1" 200 77014 438 0.12
>
> I'd like to be able to report off of the values such as "&client" and
> "&site" and "&q".
>
> Is there any way to do that?
>
> For example, I'd love to be able to have a report that would show how many
> requests had "&site=default_collection" and then see the breakdown for any
> other values for &site=.
>
> Thanks
> Terry
>
> +------------------------------------------------------------------------
> | TO UNSUBSCRIBE from this list:
> | http://lists.meer.net/mailman/listinfo/analog-help
> |
> | Analog Documentation: http://analog.cx/docs/Readme.html
> | List archives: http://www.analog.cx/docs/mailing.html#listarchives
> | Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
> +------------------------------------------------------------------------
>
>
Re: Reporting on elements in the request line [ In reply to ]
Jeremy Wadsack <jeremy.wadsack@...> writes:

> The Internal Search Word Report will give you a break down by phrases for
> one or more base urls. See http://analog.cx/docs/args.html#SEARCHENGINE for
> details on the command.
>
> You can also use ARGSINCLUDE command
> (http://analog.cx/docs/args.html#ARGSINCLUDE) if you just want to look at
> certain arguments on a given URL. This breaks down the arguments for the URL
> in the Request Report.
>
> --
> Jeremy Wadsack


Hi Jeremy - the Search reports don't seem to work. I am guessing that is because
the URLs are in the "request" and they are not "referrer" values.

I don't see any documentation on how to explicitly report on certain arguments.

Would it be:

ARGSINCLUDE /search site, q, client

or is there some other format?

Thanks
Terry


Re: Re: Reporting on elements in the request line [ In reply to ]
Sorry, you're right: ARGSINCLUDE just says which page to break out into
arguments. You can't select arguments from the list.

The Search Engine reports look at the referrer. The Internal Search Engine
reports look at the request data.

So from the example in the docs in that link below you would use something
like this:

INTSEARCHENGINE /search q

You may have to run separate reports, or maybe just add extra lines for the
'client' and 'site' parameters. Test it out and see what happens.

Note that Analog does not do multivariate reports so you won't be able to
get a report of site parameter by client, for example.
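
For anyone who wants to sanity-check what a per-parameter breakdown should contain, here is a minimal Python sketch, independent of Analog, that tallies the values of one query parameter across request fields. The helper name count_param and the sample values intranet, mobile and fedora are made up for illustration; only the parameter names come from the log line quoted in this thread.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def count_param(request_fields, name, path="/search"):
    """Count how often each value of query parameter `name` occurs."""
    counts = Counter()
    for field in request_fields:
        # A request field looks like: GET /search?... HTTP/1.1
        parts = field.split()
        if len(parts) < 2:
            continue
        url = urlsplit(parts[1])
        if url.path != path:
            continue
        for value in parse_qs(url.query).get(name, []):
            counts[value] += 1
    return counts


# Hypothetical request fields; the parameter order is deliberately mixed.
requests = [
    "GET /search?q=red+hat&client=default_frontend&site=default_collection HTTP/1.1",
    "GET /search?site=intranet&q=fedora&client=default_frontend HTTP/1.1",
    "GET /search?q=red+hat&site=default_collection&client=mobile HTTP/1.1",
]
site_counts = count_param(requests, "site")
query_counts = count_param(requests, "q")
```

Each call gives a single-parameter breakdown, which is the same shape of result as one Internal Search report per parameter.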

--
Jeremy Wadsack


On Tue, Nov 10, 2009 at 10:44 AM, Terry Chambers
<terry.chambers@gmail.com> wrote:

> Jeremy Wadsack <jeremy.wadsack@...> writes:
>
> > The Internal Search Word Report will give you a break down by phrases
> > for one or more base urls. See
> > http://analog.cx/docs/args.html#SEARCHENGINE for details on the
> > command. You can also use ARGSINCLUDE command
> > (http://analog.cx/docs/args.html#ARGSINCLUDE) if you just want to look
> > at certain arguments on a given URL. This breaks down the arguments for
> > the URL in the Request Report.
> > --Jeremy Wadsack
>
> Hi Jeremy - the Search reports don't seem to work. I am guessing that is
> because the URLs are in the "request" and they are not "referrer" values.
>
> I don't see any documentation on how to explicitly report on certain
> arguments.
>
> Would it be:
>
> ARGSINCLUDE /search site, q, client
>
> or is there some other format?
>
> Thanks
> Terry
Re: Re: Reporting on elements in the request line [ In reply to ]
Jeremy Wadsack <jeremy.wadsack@gmail.com> wrote:
>> Sorry, you're right, the ARGSINCLUDE just says which page to break
>> out into arguments. You can't select arguments from the list.
>>
>> The Search Engine reports look at the referrer. The Internal Search
>> Engine reports look at the request data.
>>
>> So from the example in the docs in that link below you would use
>> something like this:
>>
>> INTSEARCHENGINE /search q
>>
>> You may have to run separate reports or maybe just extra lines to
>> add the 'client' and 'site' parameters. Test it out and see what
>> happens.

As far as I can tell, if you specify the same engine with multiple parameters, INTSEARCHENGINE will only take the first entry in the .cfg file. So any one report can show the q, client or site field, but you can't get them all in the same report.

And you'll probably need to add
INTSEARCHQUERY ON
to see anything.


>> Note that Analog does not do multivariate reports so you won't be
>> able to get a report of site parameter by client, for example.

Actually, in this case you _might_ be able to do something like that with some FILEALIAS commands.

FILEALIAS /search?*client=*&*&site=*&* /$4?$2

would give you a Request Report that showed how many requests were made by each client type to each site.
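
To illustrate what that alias is meant to produce, the sketch below renders the wildcard pattern as a Python regular expression and applies it to the request from the original post. This approximates the intent, not Analog's own matcher: it assumes the wildcards that stand for values ($2 and $4) stop at the next "&", and the names ALIAS and apply_alias are hypothetical.

```python
import re

# Hypothetical regex rendering of:
#   FILEALIAS /search?*client=*&*&site=*&* /$4?$2
# Assumption: the value wildcards ($2 and $4) never run past an "&".
ALIAS = re.compile(
    r"^/search\?(.*?)client=([^&]*)&(.*?)&site=([^&]*)&(.*)$"
)


def apply_alias(url):
    """Return "/<site>?<client>", or the URL unchanged if the pattern fails."""
    match = ALIAS.match(url)
    if match is None:
        return url
    return "/{}?{}".format(match.group(4), match.group(2))


# The request from the log line at the top of the thread.
sample = (
    "/search?coutput=json&q=red+hat&btnG=Google+Search&access=p"
    "&client=default_frontend&sort=date%3AD%3AL%3Ad1&entqr=3&oe=UTF-8"
    "&ie=UTF-8&ud=1&site=default_collection&ip=192.0.0.0&num=100"
    "&filter=0&output=xml"
)
aliased = apply_alias(sample)  # "/default_collection?default_frontend"
```

Under these assumptions a request where site comes before client is left unchanged, because the pattern expects client first.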

Aengus

Re: Reporting on elements in the request line [ In reply to ]
Aengus <Analog07@...> writes:

> As far as I can tell, INTSEARCHENGINE will only take the first entry in the
> .cfg file, if you specify the same Engine with multiple parameters. So you
> can see either the q, client or site field in any one report, you can't get
> them all in the same report.
>
> And you'll probably need to add
> INTSEARCHQUERY ON
> to see anything
>
> >> Note that Analog does not do multivariate reports so you won't be
> >> able to get a report of site parameter by client, for example.
>
> Actually, in this case you _might_ be able to do something like that with
> some FILEALIAS commands.
>
> FILEALIAS /search?*client=*&*&site=*&* /$4?$2
>
> would give you a Request Report that showed how many requests were made by
> each client type to each site.
>
> Aengus

Hi Aengus - using the INTSEARCHENGINE and INTSEARCHQUERY options, the report
came through correctly. Thanks!

Using the FILEALIAS also works to break down the data, which is also great!

If I wanted to have BOTH the Internal Search report and also break the Request
report down this way, is there an option to allow that? I am guessing not, since
the Internal Search report goes away if I add the FILEALIAS line.

Thanks for the suggestions!
Terry

Re: Re: Reporting on elements in the request line [ In reply to ]
Terry Chambers <terry.chambers@gmail.com> wrote:
>
> Hi Aengus - using the INTSEARCHENGINE and INTSEARCHQUERY options, the
> report came through correctly. Thanks!
>
> Using the FILEALIAS also works to break down the data, which is also
> great!
>
> If I wanted to have BOTH the Internal Search report and also break
> the Request report down this way, is there an option to allow that?
> I am guessing no since the Internal Search report goes away if I add
> the Filealias line in.

The FILEALIAS line basically changes the request line before Analog analyses it, so it no longer matches your INTSEARCHENGINE entry.

You might be able to get something useful by changing the FILEALIAS command:

FILEALIAS /search?*client=*&*&site=*&* /$4?client=$2
INTSEARCHENGINE /* client

That assumes that Client is the field that you want in the Search Report.
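
A small Python sketch of why this works, assuming the alias has already turned the request into /<site>?client=<client>: once the name is rewritten, the q argument is gone, but client is still there as a query field for the search report to pick up. The helper search_terms and the shortened sample URL are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(url, field):
    """Return the values of query field `field` in `url` (empty list if absent)."""
    return parse_qs(urlsplit(url).query).get(field, [])


# Shortened, hypothetical version of the request from the original post.
original = (
    "/search?q=red+hat&client=default_frontend&num=10"
    "&site=default_collection&output=xml"
)
# What the alias target /$4?client=$2 is meant to produce for that request.
aliased = "/default_collection?client=default_frontend"

q_before = search_terms(original, "q")       # the search phrase is present
q_after = search_terms(aliased, "q")         # lost once the name is rewritten
client_after = search_terms(aliased, "client")
```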

Aengus

Re: Re: Reporting on elements in the request line [ In reply to ]
On Tue, Nov 10, 2009 at 2:40 PM, Aengus <Analog07@eircom.net> wrote:

>
> FILEALIAS /search?*client=*&*&site=*&* /$4?client=$2
>

Also, note that depending on your backend code and how the URLs are created,
you may need to use two FILEALIAS commands to catch all of this:


FILEALIAS /search?*client=*&*&site=*&* /$4?client=$2
FILEALIAS /search?*site=*&*&client=*&* /$2?client=$4

You could check your server logs and see if the parameters ever change
order.
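
If the order does vary, another option is to normalise the request names outside Analog before analysis. The Python sketch below is one hypothetical way to do that: it parses the query string, so the position of client and site and the number of parameters around them no longer matter. The function name site_client_alias is invented here, and nothing in it is part of Analog.

```python
from urllib.parse import urlsplit, parse_qs


def site_client_alias(url):
    """Build "/<site>?client=<client>" regardless of parameter order."""
    parts = urlsplit(url)
    if parts.path != "/search":
        return url
    params = parse_qs(parts.query)
    site = params.get("site", [None])[0]
    client = params.get("client", [None])[0]
    if site is None or client is None:
        # Leave requests that lack either parameter untouched.
        return url
    return "/{}?client={}".format(site, client)


# Hypothetical requests with the two parameters in either order.
client_first = site_client_alias("/search?q=x&client=a&num=10&site=b&filter=0")
site_first = site_client_alias("/search?site=b&client=a")
```

Both calls yield the same name, so the two orderings end up on one line of a Request Report instead of needing one alias per ordering.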

--
Jeremy Wadsack
Re: Reporting on elements in the request line [ In reply to ]
Jeremy Wadsack <jeremy.wadsack@...> writes:

> On Tue, Nov 10, 2009 at 2:40 PM, Aengus <Analog07@eircom.net> wrote:
>
> > FILEALIAS /search?*client=*&*&site=*&* /$4?client=$2
>
> Also, note that depending on your backend code and how the URL's are created
> you may need to use two FILEALIAS commands to catch all of this:
>
> FILEALIAS /search?*client=*&*&site=*&* /$4?client=$2
> FILEALIAS /search?*site=*&*&client=*&* /$2?client=$4
>
> You could check your server logs and see if the parameters ever change order.
>
> --Jeremy Wadsack


The parameters do change order and there can be any number of parameters in the
middle or before.

Also, for the "Internal Search" report, I want to report off of the "q" value so
it looks like I'll have to run two separate reports to get that data since the
FILEALIAS mangles things up for that.

Thanks for the assistance and guidance.

Terry
