Mailing List Archive

Failure scenarios?
I have come to understand that in some builds, under some conditions, Varnish
may hang or crash. (We run 1.0.4-3el4.i386.rpm.)

I have now routed all our ~180 sites through Varnish: pipe by default, cached
for selected hostnames. Talk about putting all one's eggs in one basket :)

The way it is all set up, we have Varnish on port 80 on one IP and Apache on
port 80 on another, which is not used for anything directly.
If anything should happen to Varnish, or we need to upgrade or anything, 3
lines to netfilter will reroute all the traffic directly to Apache.

Now the question is, how do I best detect whether Varnish has a problem?
Would it be reasonably reliable to just check that the pid
from /var/run/varnish.pid is running, do I need to fetch a page, or is there
some better way?

Gaute
Failure scenarios?
----- Gaute Amundsen <gaute at pht.no> wrote:
> I have come to understand that in some builds, under some conditions,
> Varnish may hang or crash. (We run 1.0.4-3el4.i386.rpm.)

Hi Gaute,

I'll just say that in my experience Varnish has proven itself to be extremely stable. We actually run 1.0.3 across the board (yes, I know there are known bugs, but we have not run into them at all) and Varnish currently serves all requests at www.startsiden.no and www.abcnyheter.no. When something has broken, it has so far never been Varnish.

However, our scenario is pretty different from yours: we have very few vhosts, but each has a very high volume of traffic. There is little or no advanced VCL configuration on our sites; we're pretty close to the default. The two sites differ in where Varnish is placed: one runs on dedicated Varnish servers, the other has Varnish and apache2 on the same box.

> Now the question is, how do I best detect whether Varnish has a problem?
> Would it be reasonably reliable to just check that the pid
> from /var/run/varnish.pid is running, do I need to fetch a page, or is
> there some better way?

Well, we always monitor as high up the stack as possible to make sure everything works on all levels. Lower-level monitoring is useful too, but mainly for pinpointing more accurately where the problem is.
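
To illustrate the layering described above, here is a minimal Python sketch, assuming a POSIX host. The function names (`pid_alive`, `port_open`, `diagnose`) are hypothetical and not part of Varnish or any monitoring tool. The end-to-end page check decides whether there is a problem at all; the lower-level checks are only consulted to narrow down where it is.

```python
import os
import socket


def pid_alive(pidfile):
    """Return True if the pid stored in `pidfile` refers to an existing process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed pidfile
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(page_ok, port_ok, pid_ok):
    """Combine check results, trusting the highest-level check first."""
    if page_ok:
        return "ok"
    if not pid_ok:
        return "process not running"
    if not port_ok:
        return "process running but not accepting connections"
    return "accepting connections but not serving pages"
```

Here `page_ok` is the result of whatever end-to-end check is in use, such as fetching a known page through Varnish; only when it fails do the other two results matter.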

Regards
--
Denis Braekhus - Teknisk Ansvarlig ABC Startsiden AS
http://www.startsiden.no
Failure scenarios?
Gaute Amundsen <gaute at pht.no> writes:
> Now the question is, how do I best detect whether Varnish has a
> problem? Would it be reasonably reliable to just check that the pid from
> /var/run/varnish.pid is running, do I need to fetch a page, or is
> there some better way?

I would recommend retrieving a page (or a set of pages). Simply
checking the pid won't help you if Varnish has gone off into la-la land,
or been SIGSTOPped or something.
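
A minimal sketch of such a page check in Python, using only the standard library. The function names `varnish_responds` and `all_pages_ok` are hypothetical, and the URLs passed in would be whatever pages are served through Varnish. The timeout is what catches a hung or stopped process: the connection may still be accepted, but no response arrives.

```python
import http.client
import urllib.error
import urllib.request

# Ignore any proxy settings from the environment so that the request
# goes straight to the address being checked.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def varnish_responds(url, timeout=5.0):
    """Return True if a GET of `url` ends in a 2xx or 3xx status.

    `timeout` applies to each blocking socket operation, not to the
    request as a whole.
    """
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers connection refused, timeouts, and 4xx/5xx responses.
        return False


def all_pages_ok(urls, timeout=5.0):
    """Return True only if every URL in `urls` responds successfully."""
    return all(varnish_responds(u, timeout) for u in urls)
```

A cron job or monitoring script could call `all_pages_ok` with a handful of representative URLs and trigger the netfilter reroute when it returns False.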

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Failure scenarios?
On Tuesday 03 July 2007 11:27, Dag-Erling Smørgrav wrote:
> Gaute Amundsen <gaute at pht.no> writes:
> > Now the question is, how do I best detect whether Varnish has a
> > problem? Would it be reasonably reliable to just check that the pid from
> > /var/run/varnish.pid is running, do I need to fetch a page, or is
> > there some better way?
>
> I would recommend retrieving a page (or a set of pages). Simply
> checking the pid won't help you if Varnish has gone off into la-la land,
> or been SIGSTOPped or something.
>
> DES

Not what I _wanted_ to hear, but what I expected, I guess :)

Gaute
Failure scenarios?
Gaute Amundsen wrote:
> On Tuesday 03 July 2007 11:27, Dag-Erling Smørgrav wrote:
>
>>
>> I would recommend retrieving a page (or a set of pages). Simply
>> checking the pid won't help you if Varnish has gone off into la-la land,
>> or been SIGSTOPped or something.
>>
>> DES
>>
> Not what I _wanted_ to hear, but what I expected, I guess :)
>

I use monit for monitoring programs. Here is a snippet I used when
monitoring a Varnish install (too bad it never went into production;
change values to your liking / environment):

##
## Check Varnishd
## ("ipaddy" below is a placeholder for the address to probe)
check process varnishd with pidfile /var/run/varnishd.pid
    start program = "/etc/init.d/varnishd start"
    stop program = "/etc/init.d/varnishd stop"
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then alert
    if children > 50 then alert
    if loadavg(5min) greater than 5 for 2 cycles then alert
    if 3 restarts within 3 cycles then timeout
    if failed host ipaddy port 80 type tcp then restart
    if failed host ipaddy port 8080 type tcp send "ping\r\n" then restart

Monit also allows you to check the response using a regex, though I
never got it to work. Check the manual at
http://www.tildeslash.com/monit/doc/manual.php#connection_testing
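
Independent of monit, the same kind of response check can be sketched in a few lines of Python. This is only an illustration, not monit's mechanism: the function name `page_matches` is hypothetical, and the URL and pattern would depend on the site being monitored.

```python
import http.client
import re
import urllib.error
import urllib.request

# Bypass any proxy configured in the environment so the check hits
# the target address directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def page_matches(url, pattern, timeout=5.0, max_bytes=65536):
    """Fetch `url` and return True if `pattern` is found in the body.

    Only the first `max_bytes` of the response are read, which keeps
    the check cheap on large pages.
    """
    try:
        with _opener.open(url, timeout=timeout) as resp:
            body = resp.read(max_bytes).decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # No usable response at all: treat it as a failed check.
        return False
    return re.search(pattern, body) is not None
```

Matching on a string that only the real page contains guards against the case where something answers on port 80 but serves an error page or the wrong site.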

Also, maybe you can use swatch to monitor its log file for nasty things
and automatically restart it / email you when it happens?

-- james
Failure scenarios?
James Quacinella wrote:
> I use monit for monitoring programs. Here is a snippet I used when
> monitoring a Varnish install (too bad it never went into production;
> change values to your liking / environment):
>

A billion thanks to you :)
I have been looking for a tool that could do this for me.

Regards
A.S