Mailing List Archive

varnish eats all RAM
Hi all,

we have a weird problem with one of our varnish instances running on Debian
buster: varnish consumes all RAM over time and does not free it again. This
leads to service unavailability until varnish is either killed by the
oom-killer or restarted by a monitoring service (which we set up to avoid
errors). The system runs on an Intel(R) Xeon(R) CPU E5335 and has 16GB RAM.

We initially experienced this problem with varnish 6.1 (default debian version),
but we upgraded to varnish 6.6 using the official varnish repos, and this
behaviour persists.

I attached a graph of the RAM usage, in which you can see the continuous
allocation of RAM until the process gets restarted.

Does anyone have an idea how we could examine the RAM usage further?

Used varnish modules:
- directors
- std
- digest
- cookie

Used varnish options:
-p listen_depth=4096 \
-p thread_pool_min=200 \
-p thread_pool_max=300 \
-p workspace_backend=64k \
-p workspace_client=128k \
-p nuke_limit=1000 \
-p thread_pools=2 \
-s malloc,3G

Greetings,
Marco Dickert
--
Mit freundlichen Grüßen
Marco Dickert

Administration und Technik
evolver services GmbH

Fon +49 / (0)3 71 / 4 00 03 78 24
Fax +49 / (0)3 71 / 4 00 03 79

E-Mail marco.dickert@evolver.de
Web https://www.evolver.de

Sitz der Gesellschaft: Chemnitz
Handelsregister: Amtsgericht Chemnitz, HRB 22649
Geschäftsführer: Torsten Gramann und Mathias Möckel
Re: varnish eats all RAM
On 4/29/21 17:21, Marco Dickert - evolver group wrote:
>
> we have a weird problem with one of our varnish instances running on debian
> buster. The problem is, that varnish consumes all RAM over time and does not
> free it again.

Just a hunch, but this is reminiscent of problems that I've seen caused
by transparent huge pages, which are enabled by default in recent Linux
versions.

Why THP can be a problem with Varnish & jemalloc takes some explaining,
but maybe first check if that could be your problem.

On Linux you can look for the value of AnonHugePages in /proc/meminfo,
value in KB. You can also look for rss_huge in
/sys/fs/cgroup/memory/memory.stat (cgroup v1), where the value is in
bytes. The size of a single huge page appears as Hugepagesize in
/proc/meminfo (usually it's 2MB on Linux).
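
The /proc/meminfo part of that check could be scripted roughly as follows.
This is only a sketch: the function names meminfo_kb and thp_report are made
up for illustration, and it reports the system-wide figure, not what
Varnish alone is using.

```shell
#!/bin/sh
# Sketch: report how much anonymous memory is backed by transparent huge
# pages, read from a file in /proc/meminfo format.
# meminfo_kb and thp_report are hypothetical helper names.

# meminfo_kb FIELD [FILE]: print the value (in kB) of FIELD.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "${2:-/proc/meminfo}"
}

# thp_report [FILE]: print THP usage in kB, MiB and number of huge pages.
thp_report() {
    file="${1:-/proc/meminfo}"
    anon_kb=$(meminfo_kb AnonHugePages "$file")
    page_kb=$(meminfo_kb Hugepagesize "$file")
    if [ -z "$anon_kb" ] || [ -z "$page_kb" ] || [ "$page_kb" -eq 0 ]; then
        echo "THP counters not available in $file" >&2
        return 1
    fi
    echo "AnonHugePages: ${anon_kb} kB ($((anon_kb / 1024)) MiB," \
         "$((anon_kb / page_kb)) pages of ${page_kb} kB)"
}
```

Calling thp_report with no argument reads /proc/meminfo on the running
system; passing a file name lets you check a saved copy instead.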

If those values are very large and seem to account for most of your high
RAM consumption, considerably higher than you would have expected for
Varnish, then THP may be your problem.

If you have jemalloc 4.5 or later, you can turn off THP
specifically for Varnish with the configuration thp:never for jemalloc,
either in /etc/jemalloc.conf, or by passing it in the environment
variable MALLOC_CONF to the varnishd invocation -- set
MALLOC_CONF=thp:never and make sure Varnish inherits that in its
environment.
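
A small sketch of the environment-variable route, assuming varnishd is
started from a shell or wrapper script (under systemd the variable would
normally go into an Environment= line of the unit instead). The helper
names with_thp_never and proc_malloc_conf are invented for this example;
the thp:never value is taken from the text above.

```shell
#!/bin/sh
# Sketch: pass jemalloc's THP setting through the environment and check
# what a running process actually inherited.
# with_thp_never and proc_malloc_conf are hypothetical helper names.

# with_thp_never CMD [ARGS...]: run CMD with MALLOC_CONF=thp:never set.
with_thp_never() {
    env MALLOC_CONF=thp:never "$@"
}

# proc_malloc_conf PID: print the MALLOC_CONF entry from the environment
# the process was started with (Linux only; needs permission to read
# /proc/PID/environ). Prints nothing if the variable was not set.
proc_malloc_conf() {
    tr '\0' '\n' < "/proc/$1/environ" | grep '^MALLOC_CONF='
}
```

For example, "with_thp_never varnishd ..." with your usual options would
start Varnish with the setting, and "proc_malloc_conf PID" with the
varnishd process ID shows whether it arrived.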

thp:never is not supported by jemalloc 3.6.0, which is still very widely
deployed on many systems (unfortunately, since jemalloc is up to version
5.2.1 now).

You can also turn off THP system-wide (for all processes):

$ echo never > /sys/kernel/mm/transparent_hugepage/enabled

That might be your only option if the jemalloc version is too old.
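
To confirm which mode is active before and after the change, the same
sysfs file can be read back: the kernel lists the available modes and
puts the active one in square brackets, e.g. "always madvise [never]".
A minimal sketch, with thp_active_mode as a made-up helper name:

```shell
#!/bin/sh
# Sketch: print the active THP mode, i.e. the bracketed word in the
# sysfs "enabled" file. thp_active_mode is a hypothetical helper name.

# thp_active_mode [FILE]: print the mode shown in square brackets.
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}
```

Note that writing to the sysfs file needs root, and the setting does not
survive a reboot unless it is reapplied at boot time.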

Either way, try it and see if your RAM issues improve.


HTH,
Geoff
--
** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

Tel +49 40 2880 5731
Mob +49 176 636 90917
Fax +49 40 42949753

http://uplex.de
Re: varnish eats all RAM
Hi Geoff,

thanks for your answer!

On 2021-04-29 18:00:39, Geoff Simmons wrote:
> You can also turn off THP system-wide (for all processes):
> $ echo never > /sys/kernel/mm/transparent_hugepage/enabled
> That might be your only option if the jemalloc version is too old.

We disabled huge pages in the kernel, but this didn't solve the problem; the
RAM consumption was unaffected.

However, we found that the "transient storage" of varnish may be part of our
problem. At least, we could mitigate this behaviour by limiting the transient
storage in the start parameters (last option):

-------
DAEMON_OPTS="-a :6081 \
-T :6082 \
-f /etc/varnish/default.vcl \
-p ping_interval=6 -p cli_timeout=10 -p pipe_timeout=600 \
-p listen_depth=4096 -p thread_pool_min=200 -p thread_pool_max=500 \
-p workspace_client=128k -p nuke_limit=1000 \
-S /etc/varnish/secret \
-s malloc,6G \
-s Transient=malloc,3G"
-------

Now varnish uses less RAM, and the varnishstat output confirms that our limits
should work:

-------
SMA.s0.g_bytes 6.00G -107.61K . 6.00G 6.00G 6.00G
SMA.s0.g_space 132.70K 107.61K . 136.67K 137.35K 137.35K
SMA.Transient.g_bytes 1.55G 1022.19 . 1.55G 1.55G 1.55G
SMA.Transient.g_space 1.45G -1022.19 . 1.45G 1.45G 1.45G
-------

However, varnish, in total, uses up to 12GB RAM instead of only 6GB (cache) +
3GB (transient). I tried to find a value in the varnishstat output which might
indicate how this additional RAM is used, but didn't find anything useful yet.
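
The size of that gap can be put into numbers by comparing the resident
memory of the varnishd process with the sum of the SMA g_bytes counters.
With the figures quoted above (about 12GB resident, 6.00G + 1.55G in
g_bytes) roughly 4.4GB would lie outside the storage counters. The sketch
below only measures the gap and does not say where the memory went;
sma_bytes and unaccounted_mib are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: compare resident memory with what the malloc storage counters
# account for. sma_bytes and unaccounted_mib are hypothetical names.

# sma_bytes: read `varnishstat -1` text on stdin and print the sum of all
# SMA.<name>.g_bytes values (bytes currently allocated per storage).
sma_bytes() {
    awk '$1 ~ /^SMA\..*\.g_bytes$/ { sum += $2 } END { printf "%.0f\n", sum }'
}

# unaccounted_mib RSS_KB STORAGE_BYTES: print RSS minus storage, in MiB.
unaccounted_mib() {
    echo $(( ($1 * 1024 - $2) / 1048576 ))
}

# Possible use on a live system (assumes the newest varnishd process is
# the cache child, which holds the memory):
#   pid=$(pgrep -n varnishd)
#   unaccounted_mib "$(ps -o rss= -p "$pid")" \
#       "$(varnishstat -1 -f 'SMA.*.g_bytes' | sma_bytes)"
```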

So two questions:

1) What might cause varnish to consume considerably more RAM than (cache + transient
storage)?

2) What objects exactly are stored in the transient storage? The documentation
mentions "shortlived" objects [1] (the "shortlived" parameter is 200 in our
varnish, which seems to be the varnish debian package default, since we
didn't set this explicitly), but I am not sure whether that is limited to
cacheable objects or not. Also I don't know how to determine which
requests lead to excessive usage of the transient storage.
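
One possible way to approach the second question, sketched under
assumptions: varnishlog writes a Storage record (storage type and name)
for each fetched object in backend transactions, so the URLs of fetches
that ended up in Transient can be pulled out of its output. The helper
name transient_urls is made up, and the parsing assumes the default
varnishlog text layout.

```shell
#!/bin/sh
# Sketch: list backend fetch URLs whose object went to Transient storage.
# transient_urls is a hypothetical helper name.

# transient_urls: read `varnishlog -b` text on stdin; remember the most
# recent BereqURL and print it when a Storage record names Transient.
transient_urls() {
    awk '
        $2 == "BereqURL" { url = $3 }
        $2 == "Storage" && $NF == "Transient" { print url }
    '
}

# Possible use on a live system, most frequent URLs first:
#   varnishlog -b -i BereqURL,Storage | transient_urls \
#       | sort | uniq -c | sort -rn | head
```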

Thanks in advance for any further input.

Cheers,
Marco

[1] https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#transient-storage

_______________________________________________
varnish-misc mailing list
varnish-misc@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
Re: varnish eats all RAM
This is a known and unfortunate issue with the latest versions of jemalloc
and certain allocation patterns. You need to downgrade to jemalloc 3.6.

https://github.com/varnishcache/varnish-cache/issues/3511#issuecomment-771592238

--
Reza Naghibi
VP of Technology
Varnish Software


On Wed, May 5, 2021 at 5:35 AM Marco Dickert - evolver group <
marco.dickert@evolver.de> wrote:

> However, varnish, in total, uses up to 12GB RAM instead of only 6GB (cache) +
> 3GB (transient). I tried to find a value in the varnishstat output which might
> indicate how this additional RAM is used, but didn't find anything useful yet.
>
> So two questions:
>
> 1) What might cause varnish to consume considerably more RAM than (cache +
> transient storage)?