Mailing List Archive

5.1.3: unexplained lockup during several minutes
Hello,

I'm struggling to figure out a problem that occurs rarely, at random
times, on different machines with the same setup (which rules out
hardware problems). When it happens, Varnish stops answering requests
for several minutes.

This runs on an 8-CPU Amazon EC2 instance with the following parameters:
varnishd -F -f /etc/varnish/config.vcl -I /etc/varnish/settings.cli -n varnishd -s malloc,8G -a 127.0.0.1:8080,PROXY

When the lockup occurs, the load average climbs from 0 to about 20, and
2 of the 8 CPUs sit at 100% iowait (maybe because I use 2 thread pools?).

There is plenty of free memory, no disk I/O activity, and nothing in
dmesg. Apart from Varnish, these machines run haproxy, a couple of
varnishlog instances, and other monitoring tools.

The two non-default parameters I set are:
thread_pool_min 1000
timeout_idle 45
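For reference, a sketch of what the CLI file passed with -I might contain to set these two parameters (the path /etc/varnish/settings.cli comes from the command line above; param.set is the standard varnishd CLI command, and the same commands can be issued at runtime through varnishadm):

```
param.set thread_pool_min 1000
param.set timeout_idle 45
```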

One specificity: the backends set quite short TTLs on the objects they
produce (less than 10s), but grace is set to 300s, so I was wondering
if there might be a side effect involving the Transient storage.
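For illustration, the TTL/grace relationship described above could be expressed in VCL roughly like this (the backend address is a placeholder; the 300s grace value is the one mentioned above):

```
vcl 4.0;

backend default {
    # Placeholder backend, only here so the example compiles.
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_backend_response {
    # Backends deliver short TTLs (under 10s); keep a 300s grace
    # window so stale objects can still be served during refresh.
    set beresp.grace = 300s;
}
```

Where such objects end up depends on the shortlived parameter: objects whose ttl + grace + keep is below it are put in Transient storage, otherwise they go to the regular malloc store (s0). With a 300s grace and the default shortlived of 10s, cacheable objects should land in s0, while hit-for-miss/pass and other shortlived objects use Transient.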

Any hint or clue would be very welcome!

Below are the log messages and the output of varnishstat -1:

2018-10-24T02:40:27.455363000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:40:27.455899000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:41:27.512111000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:41:27.512321000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:42:27.571685000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:42:27.571883000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:43:27.598954000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:43:27.599156000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:44:27.645895000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:44:27.646108000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:45:03.252791000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:45:03.253069000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-10-24T02:45:03.253274000Z Error: Child (14) not responding to CLI, killed it.
2018-10-24T02:45:03.253465000Z Error: Unexpected reply from ping: 400 CLI communication error
2018-10-24T02:45:03.253769000Z Error: Child (14) died signal=3 (core dumped)
2018-10-24T02:45:03.254007000Z Debug: Child cleanup complete
2018-10-24T02:45:03.255242000Z Debug: Child (2202) Started
2018-10-24T02:45:03.339992000Z Info: Child (2202) said Child starts

2018-11-06T20:03:29.675883000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:03:29.676394000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:04:29.701217000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:04:29.701422000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:05:29.758068000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:05:29.758269000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:06:29.781659000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:06:29.781867000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:07:29.804125000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:07:29.804336000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:08:29.864111000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:08:29.864318000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:09:29.873115000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:09:29.873321000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:10:29.932102000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:10:29.932307000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:11:29.992180000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:11:29.992387000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:12:30.052123000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:12:30.052328000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:13:30.112074000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:13:30.112275000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:14:30.172123000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:14:30.172326000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:15:03.116961000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:15:03.117291000Z Error: Unexpected reply from ping: 400 CLI communication error (hdr)
2018-11-06T20:15:03.117517000Z Error: Child (2202) not responding to CLI, killed it.
2018-11-06T20:15:03.117747000Z Error: Unexpected reply from ping: 400 CLI communication error
2018-11-06T20:15:03.119003000Z Error: Child (2202) died signal=3 (core dumped)
2018-11-06T20:15:03.119377000Z Debug: Child cleanup complete
2018-11-06T20:15:03.120560000Z Debug: Child (4327) Started
2018-11-06T20:15:03.207214000Z Info: Child (4327) said Child starts

MAIN.uptime 244472 1.00 Child process uptime
MAIN.sess_conn 10813893 44.23 Sessions accepted
MAIN.sess_drop 0 0.00 Sessions dropped
MAIN.sess_fail 0 0.00 Session accept failures
MAIN.client_req_400 5 0.00 Client requests received, subject to 400 errors
MAIN.client_req_417 0 0.00 Client requests received, subject to 417 errors
MAIN.client_req 11329721 46.34 Good client requests received
MAIN.cache_hit 9785347 40.03 Cache hits
MAIN.cache_hitpass 0 0.00 Cache hits for pass.
MAIN.cache_hitmiss 87463 0.36 Cache hits for miss.
MAIN.cache_miss 439975 1.80 Cache misses
MAIN.backend_conn 1233017 5.04 Backend conn. success
MAIN.backend_unhealthy 0 0.00 Backend conn. not attempted
MAIN.backend_busy 0 0.00 Backend conn. too many
MAIN.backend_fail 0 0.00 Backend conn. failures
MAIN.backend_reuse 1641618 6.71 Backend conn. reuses
MAIN.backend_recycle 1647099 6.74 Backend conn. recycles
MAIN.backend_retry 4 0.00 Backend conn. retry
MAIN.fetch_head 6 0.00 Fetch no body (HEAD)
MAIN.fetch_length 264385 1.08 Fetch with Length
MAIN.fetch_chunked 15594 0.06 Fetch chunked
MAIN.fetch_eof 1226485 5.02 Fetch EOF
MAIN.fetch_bad 0 0.00 Fetch bad T-E
MAIN.fetch_none 1 0.00 Fetch no body
MAIN.fetch_1xx 0 0.00 Fetch no body (1xx)
MAIN.fetch_204 8 0.00 Fetch no body (204)
MAIN.fetch_304 1367111 5.59 Fetch no body (304)
MAIN.fetch_failed 2 0.00 Fetch failed (all causes)
MAIN.fetch_no_thread 0 0.00 Fetch failed (no thread)
MAIN.pools 2 . Number of thread pools
MAIN.threads 2000 . Total number of threads
MAIN.threads_limited 0 0.00 Threads hit max
MAIN.threads_created 2012 0.01 Threads created
MAIN.threads_destroyed 12 0.00 Threads destroyed
MAIN.threads_failed 0 0.00 Thread creation failed
MAIN.thread_queue_len 0 . Length of session queue
MAIN.busy_sleep 3558 0.01 Number of requests sent to sleep on busy objhdr
MAIN.busy_wakeup 3558 0.01 Number of requests woken after sleep on busy objhdr
MAIN.busy_killed 0 0.00 Number of requests killed after sleep on busy objhdr
MAIN.sess_queued 1288 0.01 Sessions queued for thread
MAIN.sess_dropped 0 0.00 Sessions dropped for thread
MAIN.n_object 1645 . object structs made
MAIN.n_vampireobject 0 . unresurrected objects
MAIN.n_objectcore 3407 . objectcore structs made
MAIN.n_objecthead 3649 . objecthead structs made
MAIN.n_backend 2 . Number of backends
MAIN.n_expired 438348 . Number of expired objects
MAIN.n_lru_nuked 0 . Number of LRU nuked objects
MAIN.n_lru_moved 5255908 . Number of LRU moved objects
MAIN.losthdr 0 0.00 HTTP header overflows
MAIN.s_sess 10813893 44.23 Total sessions seen
MAIN.s_req 11329721 46.34 Total requests seen
MAIN.s_pipe 1724 0.01 Total pipe sessions seen
MAIN.s_pass 1017203 4.16 Total pass-ed requests seen
MAIN.s_fetch 1457178 5.96 Total backend fetches initiated
MAIN.s_synth 85381 0.35 Total synthethic responses made
MAIN.s_req_hdrbytes 6238979422 25520.22 Request header bytes
MAIN.s_req_bodybytes 197010427 805.86 Request body bytes
MAIN.s_resp_hdrbytes 6906180092 28249.37 Response header bytes
MAIN.s_resp_bodybytes 1332774227295 5451643.65 Response body bytes
MAIN.s_pipe_hdrbytes 2686309 10.99 Pipe request header bytes
MAIN.s_pipe_in 1409866 5.77 Piped bytes from client
MAIN.s_pipe_out 17882349 73.15 Piped bytes to client
MAIN.sess_closed 33230 0.14 Session Closed
MAIN.sess_closed_err 5187 0.02 Session Closed with error
MAIN.sess_readahead 0 0.00 Session Read Ahead
MAIN.sess_herd 4943201 20.22 Session herd
MAIN.sc_rem_close 10784144 44.11 Session OK REM_CLOSE
MAIN.sc_req_close 21829 0.09 Session OK REQ_CLOSE
MAIN.sc_req_http10 932 0.00 Session Err REQ_HTTP10
MAIN.sc_rx_bad 0 0.00 Session Err RX_BAD
MAIN.sc_rx_body 0 0.00 Session Err RX_BODY
MAIN.sc_rx_junk 5 0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow 0 0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout 4250 0.02 Session Err RX_TIMEOUT
MAIN.sc_tx_pipe 1724 0.01 Session OK TX_PIPE
MAIN.sc_tx_error 0 0.00 Session Err TX_ERROR
MAIN.sc_tx_eof 543 0.00 Session OK TX_EOF
MAIN.sc_resp_close 0 0.00 Session OK RESP_CLOSE
MAIN.sc_overload 0 0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow 0 0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short 0 0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20 0 0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure 0 0.00 Session Err VCL_FAILURE
MAIN.shm_records 921799607 3770.57 SHM records
MAIN.shm_writes 82947584 339.29 SHM writes
MAIN.shm_flushes 1821 0.01 SHM flushes due to overflow
MAIN.shm_cont 961287 3.93 SHM MTX contention
MAIN.shm_cycles 436 0.00 SHM cycles through buffer
MAIN.backend_req 2862111 11.71 Backend requests made
MAIN.n_vcl 2 0.00 Number of loaded VCLs in total
MAIN.n_vcl_avail 2 0.00 Number of VCLs available
MAIN.n_vcl_discard 0 0.00 Number of discarded VCLs
MAIN.vcl_fail 0 0.00 VCL failures
MAIN.bans 1 . Count of bans
MAIN.bans_completed 1 . Number of bans marked 'completed'
MAIN.bans_obj 0 . Number of bans using obj.*
MAIN.bans_req 0 . Number of bans using req.*
MAIN.bans_added 1 0.00 Bans added
MAIN.bans_deleted 0 0.00 Bans deleted
MAIN.bans_tested 0 0.00 Bans tested against objects (lookup)
MAIN.bans_obj_killed 0 0.00 Objects killed by bans (lookup)
MAIN.bans_lurker_tested 0 0.00 Bans tested against objects (lurker)
MAIN.bans_tests_tested 0 0.00 Ban tests tested against objects (lookup)
MAIN.bans_lurker_tests_tested 0 0.00 Ban tests tested against objects (lurker)
MAIN.bans_lurker_obj_killed 0 0.00 Objects killed by bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff 0 0.00 Objects killed by bans for cutoff (lurker)
MAIN.bans_dups 0 0.00 Bans superseded by other bans
MAIN.bans_lurker_contention 0 0.00 Lurker gave way for lookup
MAIN.bans_persisted_bytes 16 . Bytes used by the persisted ban lists
MAIN.bans_persisted_fragmentation 0 . Extra bytes in persisted ban lists due to fragmentation
MAIN.n_purges 0 . Number of purge operations executed
MAIN.n_obj_purged 0 . Number of purged objects
MAIN.exp_mailed 3272785 13.39 Number of objects mailed to expiry thread
MAIN.exp_received 3272785 13.39 Number of objects received by expiry thread
MAIN.hcb_nolock 10225413 41.83 HCB Lookups without lock
MAIN.hcb_lock 348309 1.42 HCB Lookups with lock
MAIN.hcb_insert 348094 1.42 HCB Inserts
MAIN.esi_errors 0 0.00 ESI parse errors (unlock)
MAIN.esi_warnings 0 0.00 ESI parse warnings (unlock)
MAIN.vmods 1 . Loaded VMODs
MAIN.n_gzip 1262083 5.16 Gzip operations
MAIN.n_gunzip 41886 0.17 Gunzip operations
MAIN.n_test_gunzip 0 0.00 Test gunzip operations
MAIN.vsm_free 973840 . Free VSM space
MAIN.vsm_used 83960768 . Used VSM space
MAIN.vsm_cooling 0 . Cooling VSM space
MAIN.vsm_overflow 0 . Overflow VSM space
MAIN.vsm_overflowed 0 0.00 Overflowed VSM space
MGT.uptime 3145114 12.86 Management process uptime
MGT.child_start 3 0.00 Child process started
MGT.child_exit 0 0.00 Child process normal exit
MGT.child_stop 0 0.00 Child process unexpected exit
MGT.child_died 2 0.00 Child process died (signal)
MGT.child_dump 2 0.00 Child process core dumped
MGT.child_panic 0 0.00 Child process panic
MEMPOOL.busyobj.live 8 . In use
MEMPOOL.busyobj.pool 11 . In Pool
MEMPOOL.busyobj.sz_wanted 65536 . Size requested
MEMPOOL.busyobj.sz_actual 65504 . Size allocated
MEMPOOL.busyobj.allocs 2875324 11.76 Allocations
MEMPOOL.busyobj.frees 2875316 11.76 Frees
MEMPOOL.busyobj.recycle 2862130 11.71 Recycled from pool
MEMPOOL.busyobj.timeout 108162 0.44 Timed out from pool
MEMPOOL.busyobj.toosmall 0 0.00 Too small to recycle
MEMPOOL.busyobj.surplus 0 0.00 Too many for pool
MEMPOOL.busyobj.randry 13194 0.05 Pool ran dry
MEMPOOL.req0.live 3 . In use
MEMPOOL.req0.pool 14 . In Pool
MEMPOOL.req0.sz_wanted 65536 . Size requested
MEMPOOL.req0.sz_actual 65504 . Size allocated
MEMPOOL.req0.allocs 7871304 32.20 Allocations
MEMPOOL.req0.frees 7871301 32.20 Frees
MEMPOOL.req0.recycle 7850975 32.11 Recycled from pool
MEMPOOL.req0.timeout 82821 0.34 Timed out from pool
MEMPOOL.req0.toosmall 0 0.00 Too small to recycle
MEMPOOL.req0.surplus 2158 0.01 Too many for pool
MEMPOOL.req0.randry 20329 0.08 Pool ran dry
MEMPOOL.sess0.live 14 . In use
MEMPOOL.sess0.pool 13 . In Pool
MEMPOOL.sess0.sz_wanted 512 . Size requested
MEMPOOL.sess0.sz_actual 480 . Size allocated
MEMPOOL.sess0.allocs 5407625 22.12 Allocations
MEMPOOL.sess0.frees 5407611 22.12 Frees
MEMPOOL.sess0.recycle 5342462 21.85 Recycled from pool
MEMPOOL.sess0.timeout 184908 0.76 Timed out from pool
MEMPOOL.sess0.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess0.surplus 30850 0.13 Too many for pool
MEMPOOL.sess0.randry 65163 0.27 Pool ran dry
MEMPOOL.req1.live 4 . In use
MEMPOOL.req1.pool 18 . In Pool
MEMPOOL.req1.sz_wanted 65536 . Size requested
MEMPOOL.req1.sz_actual 65504 . Size allocated
MEMPOOL.req1.allocs 7870901 32.20 Allocations
MEMPOOL.req1.frees 7870897 32.20 Frees
MEMPOOL.req1.recycle 7850560 32.11 Recycled from pool
MEMPOOL.req1.timeout 82989 0.34 Timed out from pool
MEMPOOL.req1.toosmall 0 0.00 Too small to recycle
MEMPOOL.req1.surplus 1794 0.01 Too many for pool
MEMPOOL.req1.randry 20341 0.08 Pool ran dry
MEMPOOL.sess1.live 17 . In use
MEMPOOL.sess1.pool 18 . In Pool
MEMPOOL.sess1.sz_wanted 512 . Size requested
MEMPOOL.sess1.sz_actual 480 . Size allocated
MEMPOOL.sess1.allocs 5406276 22.11 Allocations
MEMPOOL.sess1.frees 5406259 22.11 Frees
MEMPOOL.sess1.recycle 5340931 21.85 Recycled from pool
MEMPOOL.sess1.timeout 184777 0.76 Timed out from pool
MEMPOOL.sess1.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess1.surplus 30947 0.13 Too many for pool
MEMPOOL.sess1.randry 65345 0.27 Pool ran dry
SMA.s0.c_req 4309927 17.63 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 127888920949 523122.98 Bytes allocated
SMA.s0.c_freed 127714379518 522409.03 Bytes freed
SMA.s0.g_alloc 3072 . Allocations outstanding
SMA.s0.g_bytes 174541431 . Bytes outstanding
SMA.s0.g_space 8415393161 . Bytes available
SMA.Transient.c_req 3511659 14.36 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 17678767500 72314.08 Bytes allocated
SMA.Transient.c_freed 17678606604 72313.42 Bytes freed
SMA.Transient.g_alloc 196 . Allocations outstanding
SMA.Transient.g_bytes 160896 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.conf-2018-11-06_142400.default.happy 0 . Happy health probes
VBE.conf-2018-11-06_142400.default.bereq_hdrbytes 1601813566 6552.14 Request header bytes
VBE.conf-2018-11-06_142400.default.bereq_bodybytes 197010427 805.86 Request body bytes
VBE.conf-2018-11-06_142400.default.beresp_hdrbytes 1481249422 6058.97 Response header bytes
VBE.conf-2018-11-06_142400.default.beresp_bodybytes 47877555284 195840.65 Response body bytes
VBE.conf-2018-11-06_142400.default.pipe_hdrbytes 2718879 11.12 Pipe request header bytes
VBE.conf-2018-11-06_142400.default.pipe_out 1409866 5.77 Piped bytes to backend
VBE.conf-2018-11-06_142400.default.pipe_in 17882349 73.15 Piped bytes from backend
VBE.conf-2018-11-06_142400.default.conn 8 . Concurrent connections to backend
VBE.conf-2018-11-06_142400.default.req 2875327 11.76 Backend requests sent
LCK.backend.creat 3 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 5750647 23.52 Lock Operations
LCK.backend_tcp.creat 1 0.00 Created locks
LCK.backend_tcp.destroy 0 0.00 Destroyed locks
LCK.backend_tcp.locks 9038321 36.97 Lock Operations
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 5084447 20.80 Lock Operations
LCK.busyobj.creat 3213172 13.14 Created locks
LCK.busyobj.destroy 3214904 13.15 Destroyed locks
LCK.busyobj.locks 30427598 124.46 Lock Operations
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 81496 0.33 Lock Operations
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 13382080 54.74 Lock Operations
LCK.hcb.creat 1 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 696280 2.85 Lock Operations
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 8967041 36.68 Lock Operations
LCK.mempool.creat 5 0.00 Created locks
LCK.mempool.destroy 0 0.00 Destroyed locks
LCK.mempool.locks 60929423 249.23 Lock Operations
LCK.objhdr.creat 349642 1.43 Created locks
LCK.objhdr.destroy 346251 1.42 Destroyed locks
LCK.objhdr.locks 75491395 308.79 Lock Operations
LCK.pipestat.creat 1 0.00 Created locks
LCK.pipestat.destroy 0 0.00 Destroyed locks
LCK.pipestat.locks 1724 0.01 Lock Operations
LCK.sess.creat 10813321 44.23 Created locks
LCK.sess.destroy 10813836 44.23 Destroyed locks
LCK.sess.locks 16814084 68.78 Lock Operations
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 81489 0.33 Lock Operations
LCK.vcapace.creat 1 0.00 Created locks
LCK.vcapace.destroy 0 0.00 Destroyed locks
LCK.vcapace.locks 0 0.00 Lock Operations
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 5815522 23.79 Lock Operations
LCK.vxid.creat 1 0.00 Created locks
LCK.vxid.destroy 0 0.00 Destroyed locks
LCK.vxid.locks 2346 0.01 Lock Operations
LCK.waiter.creat 2 0.00 Created locks
LCK.waiter.destroy 0 0.00 Destroyed locks
LCK.waiter.locks 19733708 80.72 Lock Operations
LCK.wq.creat 3 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 56100309 229.48 Lock Operations
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 21802570 89.18 Lock Operations
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 15639885 63.97 Lock Operations

Thanks !

Marc Fournier
_______________________________________________
varnish-misc mailing list
varnish-misc@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
Re: 5.1.3: unexplained lockup during several minutes
Is /var/lib/varnish mounted as tmpfs?
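A quick way to check, assuming the workdir sits under /var/lib/varnish (adjust for the -n argument in use; `stat -f` here is the GNU coreutils form):

```shell
# Report the filesystem type backing the Varnish working directory.
# tmpfs is recommended so the shared memory log never causes disk I/O.
workdir="${VARNISH_WORKDIR:-/var/lib/varnish}"
[ -d "$workdir" ] || workdir="/"   # fall back so the command still runs
stat -f -c %T "$workdir"
```

If this prints anything other than tmpfs, the 100% iowait seen during the lockups could plausibly come from the shared memory log being written back to disk.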

--
Guillaume Quintard


On Fri, Nov 9, 2018 at 5:25 PM Marc Fournier <marc.fournier@camptocamp.com>
wrote:

> [...]
> pool
> MEMPOOL.sess0.randry 65163 0.27 Pool ran dry
> MEMPOOL.req1.live 4 . In use
> MEMPOOL.req1.pool 18 . In Pool
> MEMPOOL.req1.sz_wanted 65536 . Size requested
> MEMPOOL.req1.sz_actual 65504 . Size allocated
> MEMPOOL.req1.allocs 7870901 32.20 Allocations
> MEMPOOL.req1.frees 7870897 32.20 Frees
> MEMPOOL.req1.recycle 7850560 32.11 Recycled from
> pool
> MEMPOOL.req1.timeout 82989 0.34 Timed out
> from pool
> MEMPOOL.req1.toosmall 0 0.00 Too small to
> recycle
> MEMPOOL.req1.surplus 1794 0.01 Too many for
> pool
> MEMPOOL.req1.randry 20341 0.08 Pool ran dry
> MEMPOOL.sess1.live 17 . In use
> MEMPOOL.sess1.pool 18 . In Pool
> MEMPOOL.sess1.sz_wanted 512 . Size requested
> MEMPOOL.sess1.sz_actual 480 . Size allocated
> MEMPOOL.sess1.allocs 5406276 22.11 Allocations
> MEMPOOL.sess1.frees 5406259 22.11 Frees
> MEMPOOL.sess1.recycle 5340931 21.85 Recycled from
> pool
> MEMPOOL.sess1.timeout 184777 0.76 Timed out
> from pool
> MEMPOOL.sess1.toosmall 0 0.00 Too small to
> recycle
> MEMPOOL.sess1.surplus 30947 0.13 Too many for
> pool
> MEMPOOL.sess1.randry 65345 0.27 Pool ran dry
> SMA.s0.c_req 4309927 17.63 Allocator
> requests
> SMA.s0.c_fail 0 0.00 Allocator
> failures
> SMA.s0.c_bytes 127888920949 523122.98 Bytes
> allocated
> SMA.s0.c_freed 127714379518 522409.03 Bytes freed
> SMA.s0.g_alloc 3072 . Allocations
> outstanding
> SMA.s0.g_bytes 174541431 . Bytes
> outstanding
> SMA.s0.g_space 8415393161 . Bytes
> available
> SMA.Transient.c_req 3511659 14.36 Allocator
> requests
> SMA.Transient.c_fail 0 0.00 Allocator
> failures
> SMA.Transient.c_bytes 17678767500 72314.08 Bytes
> allocated
> SMA.Transient.c_freed 17678606604 72313.42 Bytes freed
> SMA.Transient.g_alloc 196 . Allocations
> outstanding
> SMA.Transient.g_bytes 160896 . Bytes
> outstanding
> SMA.Transient.g_space 0 . Bytes
> available
> VBE.conf-2018-11-06_142400.default.happy 0 . Happy
> health probes
> VBE.conf-2018-11-06_142400.default.bereq_hdrbytes 1601813566
> 6552.14 Request header bytes
> VBE.conf-2018-11-06_142400.default.bereq_bodybytes 197010427
> 805.86 Request body bytes
> VBE.conf-2018-11-06_142400.default.beresp_hdrbytes 1481249422
> 6058.97 Response header bytes
> VBE.conf-2018-11-06_142400.default.beresp_bodybytes 47877555284
> 195840.65 Response body bytes
> VBE.conf-2018-11-06_142400.default.pipe_hdrbytes 2718879
> 11.12 Pipe request header bytes
> VBE.conf-2018-11-06_142400.default.pipe_out 1409866
> 5.77 Piped bytes to backend
> VBE.conf-2018-11-06_142400.default.pipe_in 17882349
> 73.15 Piped bytes from backend
> VBE.conf-2018-11-06_142400.default.conn 8
> . Concurrent connections to backend
> VBE.conf-2018-11-06_142400.default.req 2875327
> 11.76 Backend requests sent
> LCK.backend.creat 3
> 0.00 Created locks
> LCK.backend.destroy 0
> 0.00 Destroyed locks
> LCK.backend.locks 5750647
> 23.52 Lock Operations
> LCK.backend_tcp.creat 1
> 0.00 Created locks
> LCK.backend_tcp.destroy 0
> 0.00 Destroyed locks
> LCK.backend_tcp.locks 9038321
> 36.97 Lock Operations
> LCK.ban.creat 1
> 0.00 Created locks
> LCK.ban.destroy 0
> 0.00 Destroyed locks
> LCK.ban.locks 5084447
> 20.80 Lock Operations
> LCK.busyobj.creat 3213172
> 13.14 Created locks
> LCK.busyobj.destroy 3214904
> 13.15 Destroyed locks
> LCK.busyobj.locks 30427598
> 124.46 Lock Operations
> LCK.cli.creat 1
> 0.00 Created locks
> LCK.cli.destroy 0
> 0.00 Destroyed locks
> LCK.cli.locks 81496
> 0.33 Lock Operations
> LCK.exp.creat 1
> 0.00 Created locks
> LCK.exp.destroy 0
> 0.00 Destroyed locks
> LCK.exp.locks 13382080
> 54.74 Lock Operations
> LCK.hcb.creat 1
> 0.00 Created locks
> LCK.hcb.destroy 0
> 0.00 Destroyed locks
> LCK.hcb.locks 696280
> 2.85 Lock Operations
> LCK.lru.creat 2
> 0.00 Created locks
> LCK.lru.destroy 0
> 0.00 Destroyed locks
> LCK.lru.locks 8967041
> 36.68 Lock Operations
> LCK.mempool.creat 5
> 0.00 Created locks
> LCK.mempool.destroy 0
> 0.00 Destroyed locks
> LCK.mempool.locks 60929423
> 249.23 Lock Operations
> LCK.objhdr.creat 349642
> 1.43 Created locks
> LCK.objhdr.destroy 346251
> 1.42 Destroyed locks
> LCK.objhdr.locks 75491395
> 308.79 Lock Operations
> LCK.pipestat.creat 1
> 0.00 Created locks
> LCK.pipestat.destroy 0
> 0.00 Destroyed locks
> LCK.pipestat.locks 1724
> 0.01 Lock Operations
> LCK.sess.creat 10813321
> 44.23 Created locks
> LCK.sess.destroy 10813836
> 44.23 Destroyed locks
> LCK.sess.locks 16814084
> 68.78 Lock Operations
> LCK.vbe.creat 1
> 0.00 Created locks
> LCK.vbe.destroy 0
> 0.00 Destroyed locks
> LCK.vbe.locks 81489
> 0.33 Lock Operations
> LCK.vcapace.creat 1
> 0.00 Created locks
> LCK.vcapace.destroy 0
> 0.00 Destroyed locks
> LCK.vcapace.locks 0
> 0.00 Lock Operations
> LCK.vcl.creat 1
> 0.00 Created locks
> LCK.vcl.destroy 0
> 0.00 Destroyed locks
> LCK.vcl.locks 5815522
> 23.79 Lock Operations
> LCK.vxid.creat 1
> 0.00 Created locks
> LCK.vxid.destroy 0
> 0.00 Destroyed locks
> LCK.vxid.locks 2346
> 0.01 Lock Operations
> LCK.waiter.creat 2
> 0.00 Created locks
> LCK.waiter.destroy 0
> 0.00 Destroyed locks
> LCK.waiter.locks 19733708
> 80.72 Lock Operations
> LCK.wq.creat 3
> 0.00 Created locks
> LCK.wq.destroy 0
> 0.00 Destroyed locks
> LCK.wq.locks 56100309
> 229.48 Lock Operations
> LCK.wstat.creat 1
> 0.00 Created locks
> LCK.wstat.destroy 0
> 0.00 Destroyed locks
> LCK.wstat.locks 21802570
> 89.18 Lock Operations
> LCK.sma.creat 2
> 0.00 Created locks
> LCK.sma.destroy 0
> 0.00 Destroyed locks
> LCK.sma.locks 15639885
> 63.97 Lock Operations
>
> Thanks !
>
> Marc Fournier
> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
Re: 5.1.3: unexplained lockup during several minutes
adding varnish-misc back

Try mounting it as tmpfs (with exec rights). These files get a lot of
IO, so one usual suspect is that you are killing your filesystem with them.
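
For reference, a sketch of what that mount could look like. The size is an
assumption to adjust to your setup; the exec option is there because varnishd
compiles and loads VCL shared objects from its working directory:

```
# Hypothetical /etc/fstab entry keeping varnishd's working directory
# (shared-memory log segments, compiled VCL) in RAM instead of on disk:
tmpfs  /var/lib/varnish  tmpfs  rw,nosuid,nodev,exec,size=256m  0  0

# One-off equivalent (stop varnishd first, since mounting over the
# directory hides its existing files):
mount -t tmpfs -o rw,exec,size=256m tmpfs /var/lib/varnish

# Verify the filesystem type afterwards:
findmnt -T /var/lib/varnish
```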
--
Guillaume Quintard


On Mon, Dec 10, 2018 at 4:48 PM Marc Fournier <marc.fournier@camptocamp.com> wrote:

>
> Hello Guillaume,
>
> No, the binaries are on an overlay2 filesystem (docker container), but
> /var/lib/varnish itself is bound to a directory on the host's
> filesystem, which is ext4.
>
> This machine does run in a VM (amazon EC2) though. But nothing in dmesg
> indicates any OS- or hardware level problem.
>
> Thanks !
>
> Marc
>
> Guillaume Quintard <guillaume@varnish-software.com> writes:
>
> > Is /var/lib/varnish mounted as tmpfs?
> >
> > --
> > Guillaume Quintard
> >
> >
> > On Fri, Nov 9, 2018 at 5:25 PM Marc Fournier <marc.fournier@camptocamp.com> wrote:
> >
> >> [...]