Mailing List Archive

Varnish shutting down under heavy load
Hi All,

This is in continuation of the previous thread "Best Practices -
Suggestion Request"

The problem of varnish shutting down every 7-10 days still continues.
This resulted in significant amount of downtime, and hence i shifted all
but one server to "pound".
pound has been working fine.. but the one server that is running varnish
still kept crashing.

This is what i found in the logs
......
......
......
536 TxHeader c X-Varnish: 1278080177
536 TxHeader c Age: 4
536 TxHeader c Via: 1.1 varnish
536 ReqEnd c 1278080177 1183663532.742941000
1183663536.720536000 0.867369000 3.489705000 0.487890000
0 StatAddr 66.249.67.59 0 2220 340 4388 0 0 4373 1729613 349338293
0 WorkThread 0x2ccb851c60 start
0 CLI Rd vcl.load boot /tmp/vcl.XXBL9f1R
0 Error CLI write failed (errno=32)
[Log Ends Here]

Not that i never saw the logs, but i never make varnish save the log to
a file.
Yesterday night, i finally decided to let varnish log everything to a
file, and this is what i saw this morning.

Do i need to change something?

I am using the rpms provided for 1.0.4 on Centos 4.

/etc/sysconfig/varnish
---------------------------------------------------------------------
NFILES=131072
VARNISH_VCL_CONF=/etc/varnish/default.vcl
VARNISH_LISTEN_ADDRESS=XXX..XXX.XXX.XXX
VARNISH_LISTEN_PORT=80
# VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
# VARNISH_ADMIN_LISTEN_PORT=6082
VARNISH_MIN_THREADS=1
VARNISH_MAX_THREADS=1000
VARNISH_THREAD_TIMEOUT=120
VARNISH_STORAGE_FILE=/cache/varnish/varnish_storage.bin
VARNISH_STORAGE_SIZE=8G
VARNISH_STORAGE="file,${VARNISH_STORAGE_FILE},${VARNISH_STORAGE_SIZE}"
VARNISH_TTL=120

DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
-f ${VARNISH_VCL_CONF} \
-t ${VARNISH_TTL} \
-w
${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \
-s ${VARNISH_STORAGE}"


/etc/varnish/default.vcl
---------------------------------------------------------------------
backend default {
set backend.host = "127.0.0.1";
set backend.port = "8000";
}

sub vcl_recv {
if (req.request == "GET" && req.http.cookie) {
lookup;
}
}

sub vcl_fetch {
if (!obj.valid) {
error;
}
if (!obj.cacheable) {
pass;
}
if (resp.http.Content-Type ~ "text/html") {
pass;
}
if (resp.http.Set-Cookie) {
insert;
}
insert;
}


Regards
A.S
Varnish shutting down under heavy load [ In reply to ]
I've seen the automatic restart of the worker process fail here
during my testing, and I suspect you hit the same thing.

I have it on my list of things to investigate RSN.

Do you have any corefiles from varnishd by any chance ?

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Varnish shutting down under heavy load [ In reply to ]
Poul-Henning Kamp wrote:
> I've seen the automatic restart of the worker process fail here
> during my testing, and I suspect you hit the same thing.
>
> I have it on my list of things to investigate RSN.
>
> Do you have any corefiles from varnishd by any chance ?
>
>

I just set ulimit -c unlimited and started varnish again.

Regards.
A.S
Varnish shutting down under heavy load [ In reply to ]
Anup Shukla wrote:
> Poul-Henning Kamp wrote:
>
>> I've seen the automatic restart of the worker process fail here
>> during my testing, and I suspect you hit the same thing.
>>
>> I have it on my list of things to investigate RSN.
>>
>> Do you have any corefiles from varnishd by any chance ?
>>
>>
>>
>
> I just set ulimit -c unlimited and started varnish again.
>
> Regards.
>

Did not take it long to go down again.
But i am not able to find any "core" dumps.
Some help as to how to get one will be appreciated.

The logs said

843 TxHeader c X-Varnish: 1617761482 1617756991
843 TxHeader c Age: 75
843 TxHeader c Via: 1.1 varnish
843 ReqEnd c 1617761482 1183726324.435635000
1183726324.435695000 4.934160000 0.000019000 0.000041000
0 StatAddr 124.64.56.126 0 31 11 15 0 0 1 4243 180969
1017 SessionOpen c 58.214.216.252 3364
0 CLI Rd vcl.load boot /tmp/vcl.XXnkvV6y
0 CLI Wr 0 200 Loaded "/tmp/vcl.XXnkvV6y" as "boot"

0 Error CLI read 0 (errno=4)


> A.S
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at projects.linpro.no
> http://projects.linpro.no/mailman/listinfo/varnish-misc
>
>
Varnish shutting down under heavy load [ In reply to ]
In message <468E3D6C.9080000 at iamcool.net>, Anup Shukla writes:
>Anup Shukla wrote:

>Did not take it long to go down again.
>But i am not able to find any "core" dumps.
>Some help as to how to get one will be appreciated.

Ok, next step:

Run varnishd in the forground by giving it two -d options:

varnishd -d -d [your other options]



--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Varnish shutting down under heavy load [ In reply to ]
Poul-Henning Kamp wrote:
> In message <468E3D6C.9080000 at iamcool.net>, Anup Shukla writes:
>
>> Anup Shukla wrote:
>>
>
>
>> Did not take it long to go down again.
>> But i am not able to find any "core" dumps.
>> Some help as to how to get one will be appreciated.
>>
>
> Ok, next step:
>
> Run varnishd in the forground by giving it two -d options:
>
> varnishd -d -d [your other options]
>

This is the output from the debug console (a lot of "Child not
responding" messages scrolled beyond the screen buffer)
=============================
.....
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=29313 status=0x86
Clean child
Child cleaned
start child pid 2961
Child said (2, 2961): <<Child starts
managed to mmap 8589934592 bytes of 8589934592
Ready
CLI ready
>>
Child said (2, 2961): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=2961 status=0x9
Clean child
Child cleaned
start child pid 5942
Pushing vcls failed:
CLI communication error
Child said (1, 5942): <<Child starts
>>
unlink /tmp/vcl.XXxtW5RB


varnish.log
=============================
....
199 TxHeader c ETag: "56373-7414-43120356ea200"
199 TxHeader c Content-Type: application/x-shockwave-flash
199 TxHeader c Content-Length: 29716
199 TxHeader c X-Varnish: 1888641652
199 TxHeader c Age: 0
199 TxHeader c Via: 1.1 varnish
199 ReqEnd c 1888641652 1183753467.604381000
1183753469.254338000 0.000404000 0.764325000 0.885632000
0 StatAddr 218.164.51.156 0 1534 136 274 2 0 188 62079 46278586
0 CLI Rd vcl.load boot /tmp/vcl.XXxtW5RB
0 Error CLI write failed (errno=32)


The core dump is 11gb in size !!

gdb [varnishd] [corefile]
=============================
Core was generated by `varnishd -d -d -a 192.168.100.109:80 -f
/etc/varnish/default.vcl -s file,/cache'.
Program terminated with signal 6, Aborted.
Error while mapping shared library sections:
/tmp/vcl.XXxtW5RB: No such file or directory.
Reading symbols from /usr/lib64/libvarnish.so.0...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libvarnish.so.0
Reading symbols from /usr/lib64/libvcl.so.0...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libvcl.so.0
Reading symbols from /lib64/libdl.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/tls/librt.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/tls/librt.so.1
Reading symbols from /lib64/tls/libpthread.so.0...(no debugging symbols
found)...done.
Loaded symbols for /lib64/tls/libpthread.so.0
Reading symbols from /lib64/tls/libc.so.6...
(no debugging symbols found)...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Error while reading shared library symbols:
/tmp/vcl.XXxtW5RB: No such file or directory.
#0 0x000000386742e21d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x000000386742e21d in raise () from /lib64/tls/libc.so.6
#1 0x000000386742fa1e in abort () from /lib64/tls/libc.so.6
#2 0x000000321d502205 in lbv_assert () from /usr/lib64/libvarnish.so.0
#3 0x000000000041b9fc in VSL_MgtInit ()
#4 0x000000000040a611 in Fetch ()
#5 0x00000000004086f3 in CNT_Session ()
#6 0x000000000040ed0b in WRK_Sendfile ()
#7 0x000000000040ef12 in WRK_Sendfile ()
#8 0x0000003868106137 in start_thread () from /lib64/tls/libpthread.so.0
#9 0x00000038674c7113 in clone () from /lib64/tls/libc.so.6

Anything else that i should do?

Regards
A.S