Mailing List Archive: child process restarts frequently

child process restarts frequently

Jul 4, 2007, 3:46 AM

Post #1 of 9 (870 views)

We have a problem with the child process regularly dying and restarting.

We use Varnish in front of three backends, each serving different types
of content. One of these backends is quite slow, and initially we
thought the problem with the child process was related to this slow
backend. We hoped that upgrading from 1.0.3 to 1.0.4 would solve this,
but unfortunately is has not.

We have a 64-bits machine with a 400G disk cache.

Below I have attached excerpts from what I think is relevant parts of
the log( with some description in between) for one of these restart events:

-------------------------------------

First request with id 138012248 is fetched and inserted in the cache:

76 ReqStart c xxx.xx.xxx.xx 4589 138012248
76 RxRequest c GET
76 RxURL c
/weatherapi/locationforecast/1.1/?lat=59.2082531856325;lon=
9.60892344715165
76 RxProtocol c HTTP/1.1
76 VCL_call c recv
76 VCL_return c lookup
76 VCL_call c hash
76 VCL_return c hash
76 VCL_call c miss
76 VCL_return c fetch
55 BackendXID b 138012248
76 Backend c 55 weatherapi
55 TxRequest b GET
55 TxURL b
/weatherapi/locationforecast/1.1/?lat=59.2082531856325;lon=
9.60892344715165
55 TxProtocol b HTTP/1.1
55 TxHeader b X-Varnish: 138012248
55 TxHeader b X-Forwarded-for:
55 RxProtocol b HTTP/1.1
55 RxStatus b 203
55 RxResponse b Non-Authoritative Information
55 RxHeader b Date: Tue, 03 Jul 2007 14:07:08 GMT
55 RxHeader b Server: Apache/1.3.34 (Debian) mod_perl/1.29
55 RxHeader b Expires: Tue, 03 Jul 2007 15:00:00 GMT
55 RxHeader b Content-length: 115322
55 RxHeader b Content-Type: text/xml; charset=utf-8
76 ObjProtocol c HTTP/1.1
76 ObjStatus c 203
76 ObjResponse c Non-Authoritative Information
76 ObjHeader c Date: Tue, 03 Jul 2007 14:07:08 GMT
76 ObjHeader c Server: Apache/1.3.34 (Debian) mod_perl/1.29
76 ObjHeader c Expires: Tue, 03 Jul 2007 15:00:00 GMT
76 ObjHeader c Content-Type: text/xml; charset=utf-8
76 ObjHeader c Content-Length: 115322
55 BackendReuse b weatherapi
76 TTL c 138012248 RFC 3172 1183471628 1183471628
1183474800 0 0
76 VCL_call c fetch
76 VCL_return c insert
76 Length c 115322
76 TxProtocol c HTTP/1.1
76 TxStatus c 203
76 TxResponse c Non-Authoritative Information
76 ObjHeader c Date: Tue, 03 Jul 2007 14:07:08 GMT
76 ObjHeader c Server: Apache/1.3.34 (Debian) mod_perl/1.29
76 ObjHeader c Expires: Tue, 03 Jul 2007 15:00:00 GMT
76 ObjHeader c Content-Type: text/xml; charset=utf-8
76 ObjHeader c Content-Length: 115322
55 BackendReuse b weatherapi
76 TTL c 138012248 RFC 3172 1183471628 1183471628
1183474800 0 0
76 VCL_call c fetch
76 VCL_return c insert
76 Length c 115322
76 TxProtocol c HTTP/1.1
76 TxStatus c 203
76 TxResponse c Non-Authoritative Information
76 TxHeader c Date: Tue, 03 Jul 2007 14:07:08 GMT
76 TxHeader c Server: Apache/1.3.34 (Debian) mod_perl/1.29
76 TxHeader c Expires: Tue, 03 Jul 2007 15:00:00 GMT
76 TxHeader c Content-Type: text/xml; charset=utf-8
76 TxHeader c Content-Length: 115322
76 TxHeader c X-Varnish: 138012248
76 TxHeader c Age: 0
76 TxHeader c Via: 1.1 varnish
76 ReqEnd c 138012248 1183471628.619347000
1183471628.781244000 0.05796
5000 0.161765000 0.000132000
0 StatAddr xxx.xx.xxx.xxx 0 22038 3349 127106 0 0 18810
37429990 95687
09644

This object is then requested and delivered from cache several times.
Then we see the following in the log, at a time immediately after 14:59:27:

0 ExpPick 138012248
0 VCL_call timeout
0 VCL_return discard

...immediately following this( ca.14:59:33 ) we get:

0 ExpBan 138012248 was banned
0 CLI Rd vcl.load boot /tmp/vcl.XXfdES7y
0 CLI Wr 0 200 Loaded "/tmp/vcl.XXfdES7y" as "boot"

0 CLI Rd vcl.use boot
0 CLI Wr 0 200
0 CLI Rd start
0 CLI Wr 0 200

This pattern repeats itself regularly every 5 - 15 hours. I am not sure
if the request and the ExpBan tag is related to the child process
beeing restarted, but since a restart is always preceded by such a tag
it does seem likely.

The weird thing is that it seems to happen at the exact same time every
day. At 07:59 and 14:59 the child processs is restarted in the manner
shown above. We have a lot of requests which expires at each whole hour,
but I have no idea if this is related in any way.

---------------------------------------------

It would be great if anyone could explain this behaviour? What does the
ExpBan tag really mean? It seems kind of counterintuitive that a request
can be banned after beeing discarded....

Also, please let me know if you need any more information to be able to
look into the problem.

Best regards,
H?vard Futs?ter