Mailing List Archive

Varnish intermittently returns incomplete images
Our Varnish (test environment) intermittently returns incomplete images. So the binary content is not complete. When requesting the image from the backend directly (using curl), the complete image is returned every time (I tested 1000 times using a script).

This happens intermittently. Sometimes Varnish returns the complete image, sometimes half of it, sometimes 20% etc... The incomplete image is returned quickly, so I don't think there is a timeout involved (we have not configured any specific timeout in varnish).

I see nothing special in varnishlog when this happens. But I don't know how to troubleshoot this in a good way. Any suggestions?
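
One way to make the failure measurable from the client side is to compare the number of body bytes received with what the response announced. The following is a minimal Python sketch, not something from the original post: the function names and the example URL are hypothetical, and the size comparison assumes the response carries a Content-Length header.

```python
import http.client
import urllib.request


def is_truncated(content_length, body_len):
    """Return True when fewer body bytes arrived than the header announced."""
    if content_length is None:
        # Without a Content-Length header this simple comparison cannot
        # tell, so report "not known to be truncated".
        return False
    return body_len < int(content_length)


def fetch_once(url):
    """Fetch url once; return (announced_length, received_length, cut_short)."""
    cut_short = False
    with urllib.request.urlopen(url) as resp:
        announced = resp.headers.get("Content-Length")
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            # Raised when the connection closes before the full body arrives.
            body = exc.partial
            cut_short = True
    return announced, len(body), cut_short


def count_truncated(url, attempts=100):
    """Request url repeatedly and count the incomplete responses."""
    bad = 0
    for _ in range(attempts):
        announced, received, cut_short = fetch_once(url)
        if cut_short or is_truncated(announced, received):
            bad += 1
    return bad
```

A call such as count_truncated("http://varnish.example/image.jpg", attempts=1000) (hypothetical host and path) gives a failure rate that can be compared before and after a configuration change.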
Re: Varnish intermittently returns incomplete images
Hi,

Do you have objects in your cache that are significantly smaller than your
images?

What you are describing sounds like an LRU failure (check nuke_limit in
"varnishadm param.show"). Basically, on a miss, Varnish couldn't evict
enough objects to make room for the new object, so it had to truncate it
and throw it away.

If that's the issue, you can increase nuke_limit, or get a bigger cache, or
segregate small and large objects into different storages.

--
Guillaume Quintard


On Fri, May 8, 2020 at 10:14 AM Batanun B <batanun@hotmail.com> wrote:

> Our Varnish (test environment) intermittently returns incomplete images.
> So the binary content is not complete. When requesting the image from the
> backend directly (using curl), the complete image is returned every time (I
> tested 1000 times using a script).
>
> This happens intermittently. Sometimes Varnish returns the complete image,
> sometimes half of it, sometimes 20% etc... The incomplete image is returned
> quickly, so I don't think there is a timeout involved (we have not
> configured any specific timeout in varnish).
>
> I see nothing special in varnishlog when this happens. But I don't know
> how to troubleshoot this in a good way. Any suggestions?
> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
Re: Varnish intermittently returns incomplete images
Hi,

Well, sure there are some objects that are rather big (for a regular web site, up to maybe 50 MB), but most objects are maybe 10-100 kB. The last image I tried, that had intermittent problems, was about 700 kB.

Some numbers from varnishstat:

MAIN.uptime: 23+01:28:11
MAIN.n_lru_nuked: 887748
MAIN.n_lru_limited: 459
SMA.s0.c_bytes: 17.59G
SMA.s0.c_freed: 17.49G
SMA.s0.g_bytes: 99.34M
SMA.s0.g_space: 607.18K

"n_lru_nuked" seems high. Would you recommend a bigger cache in this case?
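
To put the counters above in perspective, here is a small Python sketch that derives the storage size, fill ratio, and eviction rate from them. It is only arithmetic on the posted values; the helper names are made up for illustration, and it assumes that the K/M/G suffixes printed by varnishstat are binary units.

```python
def parse_size(text):
    """Convert a varnishstat-style size such as '99.34M' to bytes.

    Assumption: K, M and G are binary units (1024-based).
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1].upper()
    if suffix in units:
        return float(text[:-1]) * units[suffix]
    return float(text)


def parse_uptime(text):
    """Convert 'days+HH:MM:SS' to seconds."""
    days, clock = text.split("+")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days) * 86400 + hours * 3600 + minutes * 60 + seconds


# Values copied from the varnishstat output above.
stats = {
    "MAIN.uptime": "23+01:28:11",
    "MAIN.n_lru_nuked": 887748,
    "MAIN.n_lru_limited": 459,
    "SMA.s0.g_bytes": "99.34M",
    "SMA.s0.g_space": "607.18K",
}

used = parse_size(stats["SMA.s0.g_bytes"])
free = parse_size(stats["SMA.s0.g_space"])
uptime = parse_uptime(stats["MAIN.uptime"])

capacity_mib = (used + free) / 1024 ** 2
fill_ratio = used / (used + free)
nukes_per_hour = stats["MAIN.n_lru_nuked"] / uptime * 3600

print(f"storage size  : {capacity_mib:.1f} MiB")
print(f"fill ratio    : {fill_ratio:.1%}")
print(f"LRU nukes/hour: {nukes_per_hour:.0f}")
print(f"n_lru_limited : {stats['MAIN.n_lru_limited']}")
```

With these numbers the store comes out at roughly 100 MB and more than 99% full, which is consistent with a cache that has to evict constantly to accept new objects.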

Below is the output from "varnishadm param.show". I'm suspecting that when we did the initial tweaking (actually only focusing on the vcl logic, not cache sizes) we glanced at this output when the server was recently started, and didn't have much traffic. Now the server has been running for a while, and the traffic has increased (still testing environment only though).

------
accept_filter -
acceptor_sleep_decay 0.9 (default)
acceptor_sleep_incr 0.000 [seconds] (default)
acceptor_sleep_max 0.050 [seconds] (default)
auto_restart on [bool] (default)
backend_idle_timeout 60.000 [seconds] (default)
backend_local_error_holddown 10.000 [seconds] (default)
backend_remote_error_holddown 0.250 [seconds] (default)
ban_cutoff 0 [bans] (default)
ban_dups on [bool] (default)
ban_lurker_age 60.000 [seconds] (default)
ban_lurker_batch 1000 (default)
ban_lurker_holdoff 0.010 [seconds] (default)
ban_lurker_sleep 0.010 [seconds] (default)
between_bytes_timeout 60.000 [seconds] (default)
cc_command exec gcc -g -O2 -fdebug-prefix-map=/build/varnish-ZKkrdt/varnish-6.0.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Werror -Wno-error=unused-result -pthread -fpic -shared -Wl,-x -o %o %s (default)
cli_limit 48k [bytes] (default)
cli_timeout 60.000 [seconds] (default)
clock_skew 10 [seconds] (default)
clock_step 1.000 [seconds] (default)
connect_timeout 3.500 [seconds] (default)
critbit_cooloff 180.000 [seconds] (default)
debug none (default)
default_grace 10.000 [seconds] (default)
default_keep 0.000 [seconds] (default)
default_ttl 120.000 [seconds] (default)
esi_iovs 10 [struct iovec] (default)
feature none (default)
fetch_chunksize 16k [bytes] (default)
fetch_maxchunksize 0.25G [bytes] (default)
first_byte_timeout 60.000 [seconds] (default)
gzip_buffer 32k [bytes] (default)
gzip_level 6 (default)
gzip_memlevel 8 (default)
h2_header_table_size 4k [bytes] (default)
h2_initial_window_size 65535b [bytes] (default)
h2_max_concurrent_streams 100 [streams] (default)
h2_max_frame_size 16k [bytes] (default)
h2_max_header_list_size 2147483647b [bytes] (default)
h2_rx_window_increment 1M [bytes] (default)
h2_rx_window_low_water 10M [bytes] (default)
http_gzip_support on [bool] (default)
http_max_hdr 64 [header lines] (default)
http_range_support on [bool] (default)
http_req_hdr_len 8k [bytes] (default)
http_req_size 32k [bytes] (default)
http_resp_hdr_len 8k [bytes] (default)
http_resp_size 32k [bytes] (default)
idle_send_timeout 60.000 [seconds] (default)
listen_depth 1024 [connections] (default)
lru_interval 2.000 [seconds] (default)
max_esi_depth 5 [levels] (default)
max_restarts 4 [restarts] (default)
max_retries 4 [retries] (default)
nuke_limit 50 [allocations] (default)
pcre_match_limit 10000 (default)
pcre_match_limit_recursion 20 (default)
ping_interval 3 [seconds] (default)
pipe_timeout 60.000 [seconds] (default)
pool_req 10,100,10 (default)
pool_sess 10,100,10 (default)
pool_vbo 10,100,10 (default)
prefer_ipv6 off [bool] (default)
rush_exponent 3 [requests per request] (default)
send_timeout 600.000 [seconds] (default)
shm_reclen 255b [bytes] (default)
shortlived 10.000 [seconds] (default)
sigsegv_handler on [bool] (default)
syslog_cli_traffic on [bool] (default)
tcp_fastopen off [bool] (default)
tcp_keepalive_intvl 75.000 [seconds] (default)
tcp_keepalive_probes 9 [probes] (default)
tcp_keepalive_time 7200.000 [seconds] (default)
thread_pool_add_delay 0.000 [seconds] (default)
thread_pool_destroy_delay 1.000 [seconds] (default)
thread_pool_fail_delay 0.200 [seconds] (default)
thread_pool_max 5000 [threads] (default)
thread_pool_min 100 [threads] (default)
thread_pool_reserve 0 [threads] (default)
thread_pool_stack 48k [bytes] (default)
thread_pool_timeout 300.000 [seconds] (default)
thread_pool_watchdog 60.000 [seconds] (default)
thread_pools 2 [pools] (default)
thread_queue_limit 20 (default)
thread_stats_rate 10 [requests] (default)
timeout_idle 5.000 [seconds] (default)
timeout_linger 0.050 [seconds] (default)
vcc_allow_inline_c off [bool] (default)
vcc_err_unref on [bool] (default)
vcc_unsafe_path on [bool] (default)
vcl_cooldown 600.000 [seconds] (default)
vcl_dir /etc/varnish:/usr/share/varnish/vcl (default)
vcl_path /etc/varnish:/usr/share/varnish/vcl (default)
vmod_dir /usr/lib/varnish/vmods (default)
vmod_path /usr/lib/varnish/vmods (default)
vsl_buffer 4k [bytes] (default)
vsl_mask -ObjProtocol,-ObjStatus,-ObjReason,-ObjHeader,-VCL_trace,-WorkThread,-Hash,-VfpAcct,-H2RxHdr,-H2RxBody,-H2TxHdr,-H2TxBody (default)
vsl_reclen 255b [bytes] (default)
vsl_space 80M [bytes] (default)
vsm_free_cooldown 60.000 [seconds] (default)
vsm_space 1M [bytes] (default)
workspace_backend 64k [bytes] (default)
workspace_client 64k [bytes] (default)
workspace_session 0.50k [bytes] (default)
workspace_thread 2k [bytes] (default)
------

Re: Varnish intermittently returns incomplete images
Also, could you explain this part for me? "so it had to truncate it and throw it away"
Why does it have to truncate it? Why not skip caching it and return it as is, from the backend, untouched?
Re: Varnish intermittently returns incomplete images
Good question. This is because by default varnish streams the response, so
it starts sending what it has, even though it's unsure it can actually
deliver. When the eviction strikes, it just aborts the transaction.

The problem with "just" passing the data to the user is that there may be
more than one user waiting for the same object, and things get really
complicated.

Getting a bigger cache would help, and segregating the storages (smaller
than 1MB, and bigger than 1MB for example) would too.
--
Guillaume Quintard


On Fri, May 8, 2020 at 11:28 AM Batanun B <batanun@hotmail.com> wrote:

> Also, could you explain this part for me? "so it had to truncate it and
> throw it away"
> Why does it have to truncate it? Why not skip caching it and return
> it as is, from the backend, untouched?
Re: Varnish intermittently returns incomplete images
OK. Interesting.

99.99% of the time that this is happening (after adjusting the cache sizes), I would say that only a single user would be requesting the image. So it would make sense if it were possible to configure Varnish to handle that scenario in a better way. When multiple users request the same resource and this nuke problem happens, it is less of a problem, and in those cases a broken image could be acceptable (but preferably Varnish would start serving each request separately, uncached, fetching from the backend each time).

I will increase the cache size, and look into splitting it into two storages. But I'm guessing you mean that the small objects should be cached in memory, and the larger ones on disk? It would make much more sense if it cached less "popular" objects on disk and more "popular" objects in memory, and only considered the object size when the in-memory cache starts to get full. Is it possible to configure Varnish to handle that in a smart and dynamic way?
Re: Varnish intermittently returns incomplete images
No no, with Varnish open-source you really want all your stuff in memory
anyway. What I meant is really to have two storages based on object size.

Imagine you have a 50MB object that needs to be stored, and your nuke_limit
is 50.

If you only have one store, then you could easily evict 50 objects of
100KB each, freeing only about 5MB and forcing you to fail the transaction.

But if you have a pool that is dedicated to bigger objects, you know that
each object you evict from it is at least 1MB big, so you cannot fail the
transaction due to nuke_limit.
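
The arithmetic in this example can be sketched in a few lines of Python. This is a deliberately simplified model for illustration, not Varnish's actual allocator: the function name is hypothetical, the store is assumed to start completely full, and eviction is modelled as a plain "remove the least recently used object" loop capped by nuke_limit.

```python
from collections import deque

KB = 1024
MB = 1024 * KB


def try_store(lru_sizes, free_bytes, new_size, nuke_limit=50):
    """Evict from the LRU end until new_size fits or nuke_limit is reached.

    lru_sizes holds the sizes of cached objects, least recently used first.
    Returns (stored, evicted_count).
    """
    lru = deque(lru_sizes)
    evicted = 0
    while free_bytes < new_size:
        if evicted >= nuke_limit or not lru:
            return False, evicted  # give up: the fetch would be aborted
        free_bytes += lru.popleft()
        evicted += 1
    return True, evicted


# One shared store full of 100KB objects: 50 evictions free only about 5MB.
print(try_store([100 * KB] * 1000, 0, 50 * MB))  # (False, 50)

# A store dedicated to objects of at least 1MB: 50 evictions free 50MB.
print(try_store([1 * MB] * 100, 0, 50 * MB))     # (True, 50)
```

In this model the same 50MB object fails in the mixed store and fits in the dedicated one, purely because each eviction frees more space.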
--
Guillaume Quintard


On Fri, May 8, 2020 at 11:50 AM Batanun B <batanun@hotmail.com> wrote:

> OK. Interesting.
>
> 99.99% of the time that this is happening (after adjusting the cache
> sizes), I would say that only a single user would be requesting the
> image. So it would make sense if it were possible to configure Varnish to
> handle that scenario in a better way. When multiple users request the
> same resource and this nuke problem happens, it is less of a problem, and
> in those cases a broken image could be acceptable (but preferably Varnish
> would start serving each request separately, uncached, fetching from the
> backend each time).
>
> I will increase the cache size, and look into splitting it into two
> storages. But I'm guessing you mean that the small objects should be
> cached in memory, and the larger ones on disk? It would make much more
> sense if it cached less "popular" objects on disk and more "popular"
> objects in memory, and only considered the object size when the in-memory
> cache starts to get full. Is it possible to configure Varnish to handle
> that in a smart and dynamic way?