Mailing List Archive

Re: Frequently updated site, sharing data with third-party websites - what effect on analog?
Natalia Lis <natalialis@gmail.com> wrote:
> Hello,
> I didn't set up Analog and I have no background in IT. I'm just trying to
> interpret the results for my company's site. It is a financial website
> with a lot of graphs updated every minute and other sections which are
> also frequently updated. We also share our real-time graphs which are
> very often displayed on third-party websites.
>
> 1. Given that most of our pages contain frequently updated elements,
> what may be the effect of this on the cache issue? Is it reasonable to
> expect that visitors will be less likely to use the cached version and
> our results are less likely to be skewed by this problem?

It's not a "problem" as such; it's just something to be aware of when interpreting the data in your log files. Some people will be able to "see your website" without making any connection to it, if they use a caching proxy server and an earlier user's request has already caused the pages to be cached. Other people may appear to be several different visitors, if they access your site through an array of proxy servers.

In the case of "dynamic" data that is generated "on the fly" every time a user accesses it, there should be a "no-cache" header that will tell a caching proxy to always request a "fresh" copy. But if "frequently updated" means that you generate a new copy every 10 minutes, then it's quite possible that each copy may get cached somewhere.
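As a rough illustration of the two cases above, here is a minimal Python sketch of how a server might choose its caching header. The function name `cache_headers` and the `refresh_seconds` parameter are hypothetical, and the use of `max-age` for the "new copy every 10 minutes" case is an assumption for illustration, not something your site necessarily does.

```python
def cache_headers(generated_on_the_fly, refresh_seconds=600):
    """Return a response-header dict for the two cases described above.

    Hypothetical helper: the names and the max-age policy are assumptions.
    """
    if generated_on_the_fly:
        # Caches must check back with the origin server before reusing a copy,
        # so these requests should keep showing up in the server's log file.
        return {"Cache-Control": "no-cache"}
    # A copy regenerated every `refresh_seconds` may legitimately be served
    # from a proxy cache until it expires, so some views never reach the log.
    return {"Cache-Control": f"max-age={refresh_seconds}"}


print(cache_headers(True))
print(cache_headers(False, 600))
```

Checking the real response headers of one of your graph URLs would show which of these cases applies to your site.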

Having said all that, caching doesn't usually have a major impact on the numbers, and, more importantly, that impact doesn't tend to change much. So even if there's an X% skew in the numbers, the skew is likely to be X% next month and the month after that, as long as there haven't been any major changes in the environment. A different site, though, might have a Y% skew, because their customer base is different, and their usage pattern is different.

> 2. What can be the effect of sharing our graphs on the "Individual hosts
> served" category? If people see our graphs on a third-party website,
> those views will count as "requests" in Analog (right?) but will those
> people be included in the "individual hosts served" count or will
> Analog only see the page on which the graphs are displayed?

There are 2 different issues here. "Individual Hosts Served" means the number of distinct hosts (roughly, end users) that make a request against your server. "Sharing your graphs" could mean a number of different things. It could mean that some 3rd party server copies your graphs and puts them on its own site, in which case you'll never see the requests for those graphs in your log files. Or it could mean that those sites simply point to the data on your website, so that the end user sends the request to your server, and that end user is an "Individual Host Served". The key difference between such a "3rd party" visitor and a direct visitor is that the "3rd party" visitor's request will name the 3rd party page in its referrer field.

For example, if you go to the home page for Analog (http://analog.cx/), you'll see a button for Sourceforge on the right, below the blue box. The source for that image is the Sourceforge server. The log files for analog.cx don't contain any information about whether or not users ever see that Sourceforge button, but the log files for the sourceforge.net server will show that the image was requested by you, and that you were told to load it by the analog.cx page.
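To make the referrer idea concrete, here is a minimal Python sketch that pulls the host and referrer out of one log line. It assumes your server writes the common NCSA "combined" log format; the sample line, the address 203.0.113.7 and the partner.example URL are invented for illustration.

```python
import re

# NCSA "combined" format (assumed):
# host ident user [time] "request" status bytes "referrer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the fields of one combined-format line, or None if it doesn't match."""
    match = COMBINED.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical request for a graph embedded in a third-party page.
sample = ('203.0.113.7 - - [21/Feb/2008:15:16:00 +0000] '
          '"GET /graphs/eurusd.png HTTP/1.1" 200 5120 '
          '"http://partner.example/markets.html" "Mozilla/5.0"')

entry = parse_line(sample)
print(entry["host"], "was referred by", entry["referrer"])
```

The host field is what feeds the "Individual Hosts Served" count, and the referrer field is what tells you the graph was embedded in someone else's page.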

Aengus

+------------------------------------------------------------------------
| TO UNSUBSCRIBE from this list:
| http://lists.meer.net/mailman/listinfo/analog-help
|
| Analog Documentation: http://analog.cx/docs/Readme.html
| List archives: http://www.analog.cx/docs/mailing.html#listarchives
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+------------------------------------------------------------------------
Re: Frequently updated site, sharing data with third-party websites - what effect on analog? [ In reply to ]
Thanks a lot, this is very useful,

Our shared graphs are hosted on our sites. So just to make sure: if
(putting aside the cache and other issues) 1000 different people/hosts
view our graph on page XYZ (which is not our website), analog will
show that we've got at least 1000 requests from 1000 individual hosts
referred to us by the site "XYZ"?

Regarding cache, I have another question. Is it possible to cache a
page and only refresh some of its items (like graphs) without making a
new request for the page? The reason I ask is that we have several
sites and for one of them, analog shows that the number of "requests
for pages" per week is three times smaller than the number of
"individual hosts served" per week. In other words, an individual host
requests less than one page. And this is precisely the site that does
not host any graphs – even the graphs which are displayed there come
from our other websites. Thus, as far as I know no files from this
website are displayed elsewhere and it would seem to me that in order
to make any requests from the site, one would have to actually visit
it and request at least one page. I know that one possible problem may
be the definition of a "page", but analog shows that all major files
on the site are .html


On Thu, Feb 21, 2008 at 3:16 PM, Aengus <analog07@eircom.net> wrote:
> [...]
Re: Frequently updated site, sharing data with third-party websites - what effect on analog? [ In reply to ]
Natalia Lis <natalialis@gmail.com> wrote:
> Thanks a lot, this is very useful,
>
> Our shared graphs are hosted on our sites. So just to make sure: if
> (putting aside the cache and other issues) 1000 different people/hosts
> view our graph on page XYZ (which is not our website), analog will
> show that we've got at least 1000 requests from 1000 individual hosts
> referred to us by the site "XYZ"?

Sort of.

If you have 5 different "3rd party" servers displaying your graphs (for example), you can get a report saying that 40% of the requests were referred from site 1, 25% from site 2, 20% from site 3, etc. But you can't get a report saying that 1000 individual hosts were referred from site 1, 700 from site 2, etc. This is in part because any given user could have visited sites 1, 2 and 3, and so they could be counted multiple times, whereas any given request can only occur once. Requests and "individual hosts" are different types of data, so they can't be reported on in exactly the same way.
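A small Python sketch may make the difference clearer. The (host, referrer) pairs below are invented, and this is only an illustration of the counting problem, not of how Analog works internally.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical (host, referrer) pairs taken from a log file.
requests = [
    ("10.0.0.1", "http://site1.example/a.html"),
    ("10.0.0.1", "http://site2.example/b.html"),
    ("10.0.0.1", "http://site3.example/c.html"),
    ("10.0.0.2", "http://site1.example/a.html"),
]

# Each request has exactly one referrer, so per-site request counts
# add up to the total number of requests.
requests_per_site = Counter(urlparse(ref).netloc for _, ref in requests)

# A host can arrive via several sites, so per-site host counts overlap.
hosts_per_site = {}
for host, ref in requests:
    hosts_per_site.setdefault(urlparse(ref).netloc, set()).add(host)

total_requests = sum(requests_per_site.values())
sum_of_hosts = sum(len(hosts) for hosts in hosts_per_site.values())
unique_hosts = len({host for host, _ in requests})

print("requests:", total_requests)
print("hosts summed over sites:", sum_of_hosts)
print("unique hosts:", unique_hosts)
```

Here the per-site request counts add up to the 4 requests in the log, but the per-site host counts add up to 4 while only 2 distinct hosts exist, because host 10.0.0.1 came via all three sites.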

(Note also that the Referrer field is optional: not all web servers log it, not all web browsers send it, and it can only be measured with the cooperation of the user.)

> Regarding cache, I have another question. Is it possible to cache a
> page and only refresh some of its items (like graphs) without making a
> new request for the page? The reason I ask is that we have several
> sites and for one of them, analog shows that the number of "requests
> for pages" per week is three times smaller than the number of
> "individual hosts served" per week. In other words, an individual host
> requests less than one page. And this is precisely the site that does
> not host any graphs – even the graphs which are displayed there come
> from our other websites. Thus, as far as I know no files from this
> website are displayed elsewhere and it would seem to me that in order
> to make any requests from the site, one would have to actually visit
> it and request at least one page. I know that one possible problem may
> be the definition of a "page", but analog shows that all major files
> on the site are .html

Hmmm. You've already covered the obvious explanation. But I wouldn't look at caching for the explanation here. I presume you have the Host report on (I think the Individual Hosts Served figure is only calculated if you're generating the Host Report). I'd change the Columns displayed for that report to show both Requests and Page requests:

HOSTCOLS RPb
HOSTSORTBY Requests

This will give you an indication of who is making Requests, but not making page requests.

Next, I'd have a look at the Status Report, and see if you're getting an unusual number of Redirects - these probably count as requests, rather than Page Requests.

Lastly, I'd have a look at the File Type Report. You said that "analog shows that all major files on the site are .html" but you didn't say how you reached that conclusion. If you looked at the Request Report, and saw that it only lists .html requests, your configuration might have a "REQINCLUDE PAGES" command that excludes any non-Page requests from the report. The File Type Report (FILETYPE ON) will give you more detailed information about the type of requests being made against the server. (You can also change the File Type Report to show Requests and Page Requests with TYPECOLS RPb.)

The bottom line is that a Host gets into your log file, and therefore gets counted, when it makes a Request. The discrepancy that you're seeing arises because not all Requests are Page Requests, so if you're seeing lots of Requests that aren't Page Requests, then your assumption about .html files must be incorrect. Once you find out what non-Page requests are causing this discrepancy, you can look at the referrers for those particular requests to see if the problem might be down to cached .html files, or, more likely, due to some dynamic content not being counted as a Page Request.
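The same point as a minimal Python sketch: every request adds its host to the host count, but only some requests are page requests. The page definition used here (paths ending in .html, .htm or /) is an assumption about your configuration, and the hosts and paths are invented.

```python
from collections import defaultdict

# Assumed page definition; check the page settings in your own configuration.
PAGE_SUFFIXES = (".html", ".htm", "/")


def is_page(path):
    """True if the requested path counts as a page under the assumed definition."""
    return path.split("?", 1)[0].lower().endswith(PAGE_SUFFIXES)


def tally(entries):
    """Map each host to [requests, page requests]."""
    counts = defaultdict(lambda: [0, 0])
    for host, path in entries:
        counts[host][0] += 1
        if is_page(path):
            counts[host][1] += 1
    return dict(counts)


# Hypothetical successful requests as (host, path) pairs.
entries = [
    ("10.0.0.1", "/index.html"),
    ("10.0.0.1", "/style.css"),
    ("10.0.0.2", "/feed.xml"),
    ("10.0.0.3", "/favicon.ico"),
    ("10.0.0.3", "/favicon.ico"),
]

counts = tally(entries)
hosts_without_pages = sorted(h for h, (reqs, pages) in counts.items() if pages == 0)
page_requests = sum(pages for _, pages in counts.values())

print("hosts served:", len(counts))
print("page requests:", page_requests)
print("hosts with no page requests:", hosts_without_pages)
```

With this sample there are 3 hosts but only 1 page request, which is the same kind of discrepancy you described: two of the hosts only ever asked for files that aren't pages.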

Sorry I can't give a cut and dried answer here - this is exactly the sort of problem that Analog is really good at solving, once you know the right questions to ask of your log files.

Hope that helps,

Aengus

Re: Frequently updated site, sharing data with third-party websites - what effect on analog? [ In reply to ]
Again, thank you for the answer,
I have read a little bit more about the relationship between IP and
unique visitors, but I'm not sure I get that: "This is in part because
any given user could have visited site 1, 2 and 3, and so they could
be counted multiple times". Assuming that the user would retain the
same IP, why would he be counted again if he made requests from
another site?

As for the second question I made an error. In fact I have no proof
that there are no non-.html files on the site. I claimed there weren't any
because I looked at the files listed in the Request Report under the
header "Listing files with at least 20 requests…" But when I summed up
the number of requests I realized it was the same as "Successful
requests for pages". In other words it is only "pages" that are listed
there. Also, in the Host Report, it shows that there are many hosts
that make many requests but 0 requests for pages. Apparently, there
are some files on the site I'm not aware of.

Best,
Natalia

On Thu, Feb 21, 2008 at 6:44 PM, Aengus <analog07@eircom.net> wrote:
> [...]
Re: Frequently updated site, sharing data with third-party websites - what effect on analog? [ In reply to ]
Natalia Lis <natalialis@gmail.com> wrote:
> Again, thank you for the answer,
> I have read a little bit more about the relationship between IP and
> unique visitors, but I'm not sure I get that: "This is in part because
> any given user could have visited site 1, 2 and 3, and so they could
> be counted multiple times". Assuming that the user would retain the
> same IP, why would he be counted again if he made requests from
> another site?

He wouldn't, but your original question was "if 1000 different people/hosts view our graph on page XYZ analog will show that we've got at least 1000 requests from 1000 individual hosts referred to us by the site XYZ?". Analog will tell you how many requests were referred from site XYZ, but it won't tell you how many individual hosts were referred from XYZ, because some of those hosts may also have been referred from PQR and from ABC. So if an individual user visits 3 different 3rd party sites, you'll have 3 requests in your log files, referred from 3 different sites, but Analog will only count 1 unique Host.

> As for the second question I made an error. In fact I have no proof
> that there are no non-.html files on the site. I claimed there weren't any
> because I looked at the files listed in the Request Report under the
> header "Listing files with at least 20 requests…" But when I summed up
> the number of requests I realized it was the same as "Successful
> requests for pages". In other words it is only "pages" that are listed
> there. Also, in the Host Report, it shows that there are many hosts
> that make many requests but 0 requests for pages. Apparently, there
> are some files on the site I'm not aware of.

There is a command "REQINCLUDE PAGES" that will cause Analog to ignore anything that isn't a Page Request when it is generating the Request Report (or, to put it another way, it turns the "Request Report" into a "Page Request Report"). You can comment out that command to see all the requests, or you can look at the File Type Report to see what other types of files are being requested.

Aengus

Re: Frequently updated site, sharing data with third-party websites - what effect on analog? [ In reply to ]
Thanks again Aengus,
This is very helpful.

On Fri, Feb 22, 2008 at 6:00 PM, Aengus <analog07@eircom.net> wrote:
> [...]
> >> making page requests.
> >>
> >> Next, I'd have a look at the Status Report, and see if you're
> >> getting an unusual number of Redirects - these probably count as
> >> requests, rather than Page Requests.
> >>
> >> Lastly, I'd have a look at the File Type report. You said that
> >> "analog shows that all major files on the site are .html" but you
> >> didn't say how you reached that conclusion. If you looked at the
> >> Request Report, and saw that it only lists .html requests, your
> >> configuration might have a "REQINCLUDE PAGES" command that excludes
> >> any non-Page requests from the report. The File Type Report
> >> (FILETYPE ON) will give you more detailed information about the type
> >> of requests being made against the server. (You can also change the
> >> File Type report to show Requests and Page Requests with TYPECOLS
> >> RPb)
> >>
> >> The bottom line is that a Host gets into your log file, and therefore
> >> gets counted, when it makes a Request. The discrepancy that you're
> >> seeing is because not all Requests are Page Requests, so if you're
> >> seeing lots of Requests that aren't Page Requests, then your
> >> assumption about .html files must be incorrect. Once you find out
> >> what non-Page requests are causing this discrepancy, you can look at
> >> the referrers for those particular requests to see if the problem
> >> might be down to cached .html files, or, more likely, due to some
> >> dynamic content not being counted as a Page Request.
> >>
> >> Sorry I can't give a cut and dried answer here - this is exactly the
> >> sort of problem that Analog is really good at solving, once you know
> >> the right questions to ask of your log files.
> >>
> >> Hope that helps,
> >>
> >>
> >> Aengus
> >>
> >> +------------------------------------------------------------------------
> >>> TO UNSUBSCRIBE from this list:
> >>> http://lists.meer.net/mailman/listinfo/analog-help
> >>>
> >>> Analog Documentation: http://analog.cx/docs/Readme.html
> >>> List archives: http://www.analog.cx/docs/mailing.html#listarchives
> >>> Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
> >> +------------------------------------------------------------------------
> >>
> >
>
>
