Description of varnishstat-items
Hi.

We're trying to understand some of the numbers from varnishstat.

In particular, we're wondering when:
Client connections accepted
Client requests received
are increased.

Are they increased when the connection is accepted and when the
request is received, or will Varnish wait until the request is
handled?

The reason I ask is that we have some backends that are unable to
handle the traffic if there are a lot of cache misses. This results
in a lot of hanging requests. At the moment, we're rejecting
all the requests to these backends. However, if we allow them again,
connections and requests decrease to less than 1/5 of what they were
before.

So basically, I'm wondering if this indicates that there's
congestion in our network, or if it's just a natural consequence of
the requests taking longer to handle?



--
Trond Michelsen
Description of varnishstat-items
In message <20070921080551.GB22366 at crusaders.no>, Trond Michelsen writes:

>In particular, we're wondering when:
> Client connections accepted

This is incremented when we have accepted the connection.
On systems with accept-filters this usually does not happen
until we have also received the first request.

> Client requests received

This is incremented whenever we have a complete request and start
to service it.

>The reason I ask is that we have some backends that are unable to
>handle the traffic if there are a lot of cache misses. This results
>in a lot of hanging requests. At the moment, we're rejecting
>all the requests to these backends. However, if we allow them again,
>connections and requests decrease to less than 1/5 of what they were
>before.
>
>So basically, I'm wondering if this indicates that there's
>congestion in our network, or if it's just a natural consequence of
>the requests taking longer to handle?

I'm not sure I can answer that based on the information you have
provided...
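
One way to gather more information on the two counters discussed
above is to take two snapshots some seconds apart and compare them.
The following is only a minimal sketch: the counter names
(client_conn, client_req), the "name value description" line layout
and all sample numbers are assumptions for illustration, modelled on
the one-shot output that some varnishstat versions print.

```python
# Sketch: compare two snapshots of varnishstat-style counters.
# The snapshot text and its values are made up for illustration.

def parse_counters(text):
    """Parse lines of the form '<name> <value> ...' into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that does not look like "<name> <integer> ...".
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def rates(before, after, seconds):
    """Per-second increase of every counter present in both snapshots."""
    return {name: (after[name] - before[name]) / seconds
            for name in before if name in after}


# Hypothetical snapshots taken 10 seconds apart.
SNAPSHOT_T0 = """\
client_conn 1000 Client connections accepted
client_req 5000 Client requests received
"""
SNAPSHOT_T10 = """\
client_conn 1200 Client connections accepted
client_req 6500 Client requests received
"""

r = rates(parse_counters(SNAPSHOT_T0), parse_counters(SNAPSHOT_T10), 10)
print("connections/s:", r["client_conn"])
print("requests/s:", r["client_req"])
print("requests per connection:", r["client_req"] / r["client_conn"])
```

Comparing these rates with and without the rejection rule in place
shows whether connections and requests drop by the same factor.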



--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Description of varnishstat-items [ In reply to ]
On Fri, Sep 21, 2007 at 08:22:12AM +0000, Poul-Henning Kamp wrote:
> In message <20070921080551.GB22366 at crusaders.no>, Trond Michelsen writes:
>>In particular, we're wondering when:
>> Client connections accepted
>
> This is incremented when we have accepted the connection.
> On systems with accept-filters this usually does not happen
> until we have also received the first request.
>
>> Client requests received
>
> This is incremented whenever we have a complete request and start
> to service it.

Thank you. That's pretty much what we expected.

>>The reason I ask is that we have some backends that are unable to
>>handle the traffic if there are a lot of cache misses. This results
>>in a lot of hanging requests. At the moment, we're rejecting
>>all the requests to these backends. However, if we allow them again,
>>connections and requests decrease to less than 1/5 of what they were
>>before.

>> So basically, I'm wondering if this indicates that there's
>> congestion in our network, or if it's just a natural consequence of
>> the requests taking longer to handle?
> I'm not sure I can answer that based on the information you have
> provided...

Yeah, I know I'm a bit vague here. Sorry about that. Here's a quick
overview of our setup:

We have two Varnish machines. Varnish1 is a frontend for a webservice
that generates images. Varnish2 is a frontend for a webservice that
generates XML content. These services run on separate servers. The
Webserver fetches XML data from Varnish2 to generate HTML pages, which
contain links to the images that are served by Varnish1. So,
basically all traffic to Varnish2 comes from a single machine, while
the traffic to Varnish1 comes from "everyone".

The Webserver is located offsite. With a single FTP connection, we get
a transfer rate of 250-300 Mbps between the webserver and
Varnish2. Both Varnish1 and Varnish2 have a gigabit connection to NIX.
Both Varnish machines are 2x dual-core Xeon servers with 8GB memory.

The backends for Varnish1 and 2 have NFS-mounted disks from the same
NFS-server.

Right now, we have added a rule to Varnish1 that rejects any
request for the images that are most time consuming for the backends
to generate. Whenever we remove this restriction and allow all
requests to go through to the backend, the number of incoming
requests to Varnish1 decreases dramatically. This isn't necessarily
surprising, since the clients will have to wait for the first images
to come through before requesting the next ones.
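
The effect described above can be put into rough numbers. This is a
back-of-the-envelope sketch, not a measurement: it assumes a fixed
number of clients that each wait for a response before sending the
next request, and the client count and service times are invented
for illustration.

```python
def closed_loop_request_rate(clients, service_time_s, think_time_s=0.0):
    """Requests per second when every client waits for its response
    (plus an optional pause) before sending the next request."""
    return clients / (service_time_s + think_time_s)


# Hypothetical numbers: 200 clients, cached images served in 0.1 s,
# slow backend-generated images served in 0.5 s.
fast = closed_loop_request_rate(clients=200, service_time_s=0.1)
slow = closed_loop_request_rate(clients=200, service_time_s=0.5)

print("requests/s with fast responses:", fast)
print("requests/s with slow responses:", slow)
print("ratio slow/fast:", slow / fast)
```

Under these assumptions, a fivefold increase in response time alone
cuts the request rate to about 1/5, without any network congestion.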

The thing that does surprise us is that whenever we try to allow all
images from Varnish1, after about 30s the webserver becomes unable to
connect to Varnish2. Connection attempts from the webserver to Varnish2
pile up, and without the XML data, it cannot generate any webpages at
all.

So far we have not been able to determine if Varnish2 is refusing
connections, or if the connections never get to the Varnish machines
at all.
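
A small probe run from the webserver could help tell these two cases
apart. This is only a sketch: the function name and the timeout are
arbitrary choices, and the interpretation is a rule of thumb. A
"refused" result means something answered with a reset, while a
"timeout" means no answer arrived in time, which can have several
causes (packet loss, a filter, or a full listen queue on the server).

```python
import socket


def probe(host, port, timeout_s=3.0):
    """Try one TCP connect and classify the outcome as a short string."""
    try:
        # create_connection resolves the host and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return "connected"
    except ConnectionRefusedError:
        # The remote end (or a device in between) actively rejected us.
        return "refused"
    except socket.timeout:
        # No answer to the connection attempt within timeout_s seconds.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return "error: %s" % exc
```

Calling probe("varnish2.example.net", 80) (a hypothetical hostname)
once a second while the restriction on Varnish1 is lifted would show
whether the failures are refusals or timeouts.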

--
Trond Michelsen