Mailing List Archive

Varnish crash (SIGABRT) about every 10 mins
On Thu, Nov 08, 2007 at 11:41:36PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin <gslin at gslin.org> writes:
> > I can see signal 3 (QUIT) from dmesg:
> >
> > pid 76061 (varnishd), uid 65534: exited on signal 3
> > pid 76187 (varnishd), uid 65534: exited on signal 3
> >
> > but I cannot find coredump. I run the following command in /tmp with
> > mode 1777:
>
> Read what I wrote earlier, you need to run 'ulimit -c unlimited'
> before starting Varnish.

I've tried 4 times, but varnishd still does not generate a dump file.
Here are my steps:

* /usr/local/sbin/varnishd.sh:

#!/bin/sh
ulimit -c unlimited
/home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096

* Run it in a directory under my home directory, which has enough disk
space and suitable permissions:

gslin at testphp [~/tmp] (7:27) d
total 8
drwxrwxrwt 2 gslin admin 4096 Nov 9 07:09 ./
drwxr-xr-x 10 gslin admin 4096 Nov 9 07:19 ../
gslin at testphp [~/tmp] (7:27) df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/da0s1a 4.8G 89M 4.4G 2% /
devfs 1.0K 1.0K 0B 100% /dev
/dev/da0s1f 48G 7.5G 36G 17% /home
/dev/da0s1d 4.8G 3.2G 1.2G 72% /usr
/dev/da0s1e 4.8G 97M 4.4G 2% /var
/dev/da1s1d 66G 3.6G 57G 6% /home/logs
10.1.1.100:/vol/home 80G 2.4G 78G 3% /.amd_mnt/10.1.1.100/vol/home
gslin at testphp [~/tmp] (7:28) pwd
/.amd_mnt/10.1.1.100/vol/home/admin/gslin/tmp

The last messages in dmesg are:

pid 76620 (varnishd), uid 65534: exited on signal 3
pid 76639 (varnishd), uid 65534: exited on signal 6

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin <gslin at gslin.org> writes:
> I've tried 4 times, but varnishd still does not generate a dump file.

Are you sure? It should be in the run-time state directory,
most likely $PREFIX/var/varnish/$HOSTNAME.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 12:32:37AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin <gslin at gslin.org> writes:
> > I've tried 4 times, but varnishd still does not generate a dump file.
>
> Are you sure? It should be in the run-time state directory,
> most likely $PREFIX/var/varnish/$HOSTNAME.

The core file is not in /home/service/var/varnish/testphp.pixnet.tw/.
My guess is that it is too small, so I retried:

* rm -rf /home/service/varnish/var/varnish/testphp.pixnet.tw
* ln -s ~/tmp /home/service/varnish/var/varnish/testphp.pixnet.tw
* chmod 1777 ~/tmp

So I got:

gslin at testphp [~/tmp] (7:47) d
total 8244
drwxrwxrwt 2 gslin admin 4096 Nov 9 07:40 ./
drwxr-xr-x 10 gslin admin 4096 Nov 9 07:33 ../
-rw-r--r-- 1 root wheel 8389208 Nov 9 07:43 _.vsl
-rwxr-xr-x 1 root wheel 14534 Nov 9 07:40 bin.LxjNthA8*
-rwxr-xr-x 1 gslin admin 260 Nov 9 07:33 varnishd.sh*
gslin at testphp [~/tmp] (7:47) cat varnishd.sh
#!/bin/sh

ulimit -c unlimited
/home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
gslin at testphp [~/tmp] (7:47)

I ran it and got SIGQUIT:

Nov 9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
Nov 9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3

Its console output:

gslin at testphp [~/tmp] (7:40) sudo ./varnishd.sh
storage_file: filename: /home/service/varnish-cache.mmap size 32768 MegaBytes.
Classic hash: 1048583 buckets
Using old SHMFILE
rolling(1)...
rolling(2)...
start
start child pid 76784
200 0

Child said (2, 76784): <<Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
>>
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=76784 status=0x3
Clean child
Child cleaned
start child pid 76793
Child said (2, 76793): <<Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
>>
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Child not responding to ping
Cache child died pid=76793 status=0x3
Clean child
Child cleaned
start child pid 76794
Child said (2, 76794): <<Child starts
sizeof(struct ws) = 48
sizeof(struct http) = 584
sizeof(struct http_conn) = 48
sizeof(struct acct) = 64
sizeof(struct worker) = 1232
sizeof(struct workreq) = 24
sizeof(struct bereq) = 656
sizeof(struct storage) = 72
sizeof(struct object) = 824
sizeof(struct objhead) = 56
sizeof(struct sess) = 448
sizeof(struct vbe_conn) = 48
sizeof(struct backend) = 88
managed to mmap 34359738368 bytes of 34359738368
Ready
CLI ready
>>
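
A side note on the "Cache child died pid=76784 status=0x3" lines above: assuming the manager prints the raw wait(2) status, the low 7 bits hold the terminating signal and bit 0x80 is the "core dumped" flag (the traditional BSD encoding). The sketch below decodes such a value. The function name decode_status is hypothetical, and it only handles the "killed by a signal" case.

```shell
#!/bin/sh
# Decode a raw wait(2) status, assuming the traditional BSD layout:
# low 7 bits = terminating signal, bit 0x80 = "core was dumped".
# A status whose low 7 bits are 0 is a normal exit and is not handled here.
decode_status() {
    status=$(( $1 ))
    sig=$(( status & 0x7f ))
    if [ $(( status & 0x80 )) -ne 0 ]; then core=yes; else core=no; fi
    echo "signal=$sig core=$core"
}

decode_status 0x3    # the value from the console output above
decode_status 0x86   # hypothetical value: SIGABRT (6) with the core flag set
```

Read this way, status=0x3 is consistent with dmesg: the child died on signal 3 (SIGQUIT) and the kernel did not record a core dump.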

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin <gslin at gslin.org> writes:
> gslin at testphp [~/tmp] (7:47) cat varnishd.sh
> #!/bin/sh
>
> ulimit -c unlimited
> /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> gslin at testphp [~/tmp] (7:47)
>
> I ran it and got SIGQUIT:
>
> Nov 9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> Nov 9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3

Still no core file? Try SIGABRT instead. If that doesn't work, I'm
out of ideas... though you can still attach directly to the child
with gdb.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 12:55:57AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin <gslin at gslin.org> writes:
> > gslin at testphp [~/tmp] (7:47) cat varnishd.sh
> > #!/bin/sh
> >
> > ulimit -c unlimited
> > /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> > gslin at testphp [~/tmp] (7:47)
> >
> > I ran it and got SIGQUIT:
> >
> > Nov 9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> > Nov 9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3
>
> Still no core file? Try SIGABRT instead. If that doesn't work, I'm
> out of ideas... though you can still attach directly to the child
> with gdb.

Okay, I'm using SIGABRT now. If it still doesn't generate a core file,
I'll try the gdb "generate-core-file" command to generate it.

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 08:02:36AM +0800, Gea-Suan Lin wrote:
> On Fri, Nov 09, 2007 at 12:55:57AM +0100, Dag-Erling Smørgrav wrote:
> > Gea-Suan Lin <gslin at gslin.org> writes:
> > > gslin at testphp [~/tmp] (7:47) cat varnishd.sh
> > > #!/bin/sh
> > >
> > > ulimit -c unlimited
> > > /home/service/varnish/sbin/varnishd -a 60.199.247.118:80 -f /usr/local/etc/varnish/image.vcl -h classic,1048583 -P /var/run/varnishd.pid -s file,/home/service/varnish-cache.mmap,32G -T 127.0.0.1:11957 -t 604800 -w 32,4096 -d -d
> > > gslin at testphp [~/tmp] (7:47)
> > >
> > > I ran it and got SIGQUIT:
> > >
> > > Nov 9 07:43:26 testphp kernel: pid 76784 (varnishd), uid 65534: exited on signal 3
> > > Nov 9 07:43:51 testphp kernel: pid 76793 (varnishd), uid 65534: exited on signal 3
> >
> > Still no core file? Try SIGABRT instead. If that doesn't work, I'm
> > out of ideas... though you can still attach directly to the child
> > with gdb.
>
> Okay, I'm using SIGABRT now. If it still doesn't generate a core file,
> I'll try the gdb "generate-core-file" command to generate it.

I got this:

(gdb) generate-core-file
Couldn't open /proc/1005/map

I'll mount procfs and try again.

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin <gslin at gslin.org> writes:
> Okay, I'm using SIGABRT now. If it still doesn't generate a core file,
> I'll try the gdb "generate-core-file" command to generate it.

One last thing to try: 'sysctl kern.coredump=1', though I don't see
why it would be 0. There's not much point in using gdb to generate a
core dump, though, as the entire point was to avoid having to attach
gdb to the child.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 09:23:46AM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin <gslin at gslin.org> writes:
> > Okay, I'm using SIGABRT now. If it still doesn't generate a core file,
> > I'll try the gdb "generate-core-file" command to generate it.
>
> One last thing to try: 'sysctl kern.coredump=1', though I don't see
> why it would be 0. There's not much point in using gdb to generate a
> core dump, though, as the entire point was to avoid having to attach
> gdb to the child.

"kern.coredump" is already 1, but I found "kern.sugid_coredump" in
core(5):

By default, a process that changes user or group credentials
whether real or effective will not create a corefile. This
behaviour can be changed to generate a core dump by setting the
sysctl(8) variable kern.sugid_coredump to 1.

Since varnishd is started by root and then switches to the nobody user,
this rule applies to it. After I changed kern.sugid_coredump from 0 to 1,
the core file was generated.
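
To summarize the three settings discussed in this thread (the core size limit, kern.coredump, and kern.sugid_coredump), here is a minimal sketch of a pre-flight check. The function name check_core is hypothetical. The live check at the bottom assumes FreeBSD and reads the limit of the checking shell itself, so it should run in the same script that starts varnishd.

```shell
#!/bin/sh
# check_core LIMIT COREDUMP SUGID
# Report which of the three settings from this thread would block a core
# dump for a daemon that changes credentials after start-up.
check_core() {
    limit=$1 coredump=$2 sugid=$3
    rc=0
    if [ "$limit" = "0" ]; then
        echo "core size limit is 0: run 'ulimit -c unlimited' before starting"
        rc=1
    fi
    if [ "$coredump" != "1" ]; then
        echo "kern.coredump is not 1"
        rc=1
    fi
    if [ "$sugid" != "1" ]; then
        echo "kern.sugid_coredump is not 1: processes that change uid/gid will not dump"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "core dumps should be possible"
    fi
    return "$rc"
}

# Live check, FreeBSD only (sysctl names as quoted from core(5) above).
if [ "$(uname -s)" = "FreeBSD" ]; then
    check_core "$(ulimit -c)" \
        "$(sysctl -n kern.coredump)" \
        "$(sysctl -n kern.sugid_coredump)" || true
fi
```

With the values reported in this message before the fix (limit unlimited, kern.coredump 1, kern.sugid_coredump 0), only the third check fires.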

The core file and the executable are at:

http://mail.pixnet.tw/~gslin/tmp/varnishd.core
http://mail.pixnet.tw/~gslin/tmp/varnishd

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
Varnish crash (SIGABRT) about every 10 mins
Gea-Suan Lin <gslin at gslin.org> writes:
> The core file and the executable are at:
>
> http://mail.pixnet.tw/~gslin/tmp/varnishd.core
> http://mail.pixnet.tw/~gslin/tmp/varnishd

That won't help; at the very least, I'd need all the Varnish libraries
and access to a system with the exact same system libraries as yours.
What I asked for was the output of "i thr" in gdb.
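
For reference, "i thr" is the gdb abbreviation of "info threads". A minimal sketch of collecting that output non-interactively follows. It only prints the command line, so it is safe to run anywhere. It assumes a gdb recent enough to support -batch and -ex (older releases may not), and the helper name thread_dump_cmd is hypothetical.

```shell
#!/bin/sh
# Print (rather than run) a gdb command line that lists all threads and
# their backtraces. TARGET is either a PID to attach to or a core file.
thread_dump_cmd() {
    binary=$1 target=$2
    printf 'gdb -batch -ex "info threads" -ex "thread apply all bt" %s %s\n' \
        "$binary" "$target"
}

thread_dump_cmd /home/service/varnish/sbin/varnishd varnishd.core
```

Running the printed command on the machine that produced the core avoids having to ship the matching libraries elsewhere.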

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Varnish crash (SIGABRT) about every 10 mins
On Fri, Nov 09, 2007 at 09:50:31PM +0100, Dag-Erling Smørgrav wrote:
> Gea-Suan Lin <gslin at gslin.org> writes:
> > The core file and the executable are at:
> >
> > http://mail.pixnet.tw/~gslin/tmp/varnishd.core
> > http://mail.pixnet.tw/~gslin/tmp/varnishd
>
> That won't help; at the very least, I'd need all the Varnish libraries
> and access to a system with the exact same system libraries as yours.

All files in /home/service/varnish have been packaged in:
http://mail.pixnet.tw/~gslin/tmp/varnish-all.tar.gz

But you may need *all libraries*. Should I create a shell account to
give you access?

> What I asked for was the output of "i thr" in gdb.

(gdb) i thr
12 Thread 0x53d000 (LWP 100256) 0x0000000800c3356c in poll () from /lib/libc.so.6
11 Thread 0x53d800 (LWP 100159) 0x0000000800c6373c in nanosleep () from /lib/libc.so.6
10 Thread 0x53da00 (LWP 100199) 0x0000000800c6373c in nanosleep () from /lib/libc.so.6
9 Thread 0x53dc00 (LWP 100212) 0x0000000800c6373c in nanosleep () from /lib/libc.so.6
8 Thread 0xa67d000 (LWP 100224) 0x0000000800c432fc in kevent () from /lib/libc.so.6
7 Thread 0xa67d200 (LWP 100225) 0x0000000800c3356c in poll () from /lib/libc.so.6
6 Thread 0xa67d400 (LWP 100226) 0x0000000800bfe07c in _umtx_op () from /lib/libc.so.6
5 Thread 0xa67d600 (LWP 100227) 0x0000000800c7bf5c in read () from /lib/libc.so.6
4 Thread 0xa67d800 (LWP 100228) 0x0000000800c7b926 in memcpy () from /lib/libc.so.6
3 Thread 0xa67da00 (LWP 100229) 0x0000000800c7b8f4 in memset () from /lib/libc.so.6
2 Thread 0xa67dc00 (LWP 100230) 0x0000000800bfe07c in _umtx_op () from /lib/libc.so.6
* 1 Thread 0xa67de00 (LWP 100231) 0x0000000800c7bf5c in read () from /lib/libc.so.6
(gdb) bt
#0 0x0000000800c7bf5c in read () from /lib/libc.so.6
#1 0x0000000800984fbb in read () from /usr/lib/libthr.so.2
#2 0x000000000041585e in HTC_Read (htc=0x7ffffe7f2a20, d=0x86dd6c000, len=160763) at cache_httpconn.c:202
#3 0x000000000041095f in Fetch (sp=0xa681008) at cache_fetch.c:72
#4 0x000000000040e42b in CNT_Session (sp=0xa681008) at cache_center.c:323
#5 0x0000000000416209 in wrk_thread (priv=0x53e5e0) at cache_pool.c:193
#6 0x000000080098729e in pthread_create () from /usr/lib/libthr.so.2
#7 0x0000000000000000 in ?? ()
Cannot access memory at address 0x7ffffe7f5000
(gdb) i fra
Stack level 0, frame at 0x7ffffe7f0930:
rip = 0x800c7bf5c in read; saved rip 0x800984fbb
called by frame at 0x7ffffe7f0960
Arglist at 0x7ffffe7f0920, args:
Locals at 0x7ffffe7f0920, Previous frame's sp is 0x7ffffe7f0930
Saved registers:
rip at 0x7ffffe7f0928
(gdb)

--
* Gea-Suan Lin (public key: Using https://keyserver.pgp.com/ to search)
* If you cannot convince them, confuse them. -- Harry S Truman
