Mailing List Archive

Traffic drop issue
Hello,

We encountered an issue during a traffic increase. Here is the timeline:

- We normally have ~1000 req/s; an event caused the traffic to increase to ~1400 req/s, with spikes at ~2300.
- Varnish created new threads up to the limit.
- The threads_limited counter increased sharply.
- Varnish did not respond to all requests, and req/s slowly decreased to 0 (see attachment).
- Restarting Varnish solved the problem; Varnish accepted all requests again.

Can you help us identify the problem? Do we need to adjust the configuration?
You can find the param.show output attached.

# varnishd -V
varnishd (varnish-6.4.0 revision 13f137934ec1cf14af66baf7896311115ee35598)
Copyright (c) 2006 Verdens Gang AS
Copyright (c) 2006-2020 Varnish Software AS

We use hitch for TLS support:

# hitch -V
hitch 1.5.2

Best Regards,
--
Sébastien
Re: Traffic drop issue
Hi,

Do you have a load-balancer in front of Varnish? The decrease looks like
the connections are being drained

Cheers,

--
Guillaume Quintard

On Mon, Oct 26, 2020 at 8:12 AM Sébastien EISSLER <sebastien@sdv.fr> wrote:

> Hello,
>
> We encountered an issue during a traffic increase. Here is the timeline:
>
> - We normally have ~1000 req/s; an event caused the traffic to increase to
> ~1400 req/s, with spikes at ~2300.
> - Varnish created new threads up to the limit.
> - The threads_limited counter increased sharply.
> - Varnish did not respond to all requests, and req/s slowly decreased to 0
> (see attachment).
> - Restarting Varnish solved the problem; Varnish accepted all requests again.
>
> Can you help us identify the problem? Do we need to adjust the
> configuration?
> You can find the param.show output attached.
>
> # varnishd -V
> varnishd (varnish-6.4.0 revision 13f137934ec1cf14af66baf7896311115ee35598)
> Copyright (c) 2006 Verdens Gang AS
> Copyright (c) 2006-2020 Varnish Software AS
>
> We use hitch for TLS support:
>
> # hitch -V
> hitch 1.5.2
>
> Best Regards,
> --
> Sébastien
> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
Re: Traffic drop issue
Hello,

Yes, we have a load balancer in front. We already investigated that path and did not detect any error or configuration limit.
The fact that the threads_limited counter increased sharply and that restarting Varnish restored the traffic suggests an issue on the Varnish side.

Attached is a varnishstat output from 3 hours before the issue.
We noticed a high value for the MAIN.sess_closed_err counter.

Thanks for your help.

Regards

> Hi,
>
> Do you have a load-balancer in front of Varnish? The decrease looks like
> the connections are being drained
>
> Cheers,
>

--
Sébastien
Re: Traffic drop issue
On Thu, Oct 29, 2020 at 3:06 PM Sébastien EISSLER <sebastien@sdv.fr> wrote:
>
> Hello,
>
> Yes, we have a load balancer in front. We already investigated that path and did not detect any error or configuration limit.
> The fact that the threads_limited counter increased sharply and that restarting Varnish restored the traffic suggests an issue on the Varnish side.

Did you try to increase thread_pool_max?

> Attached is a varnishstat output from 3 hours before the issue.
> We noticed a high value for the MAIN.sess_closed_err counter.
>
> Thanks for your help.
>
> Regards
>
> > Hi,
> >
> > Do you have a load-balancer in front of Varnish? The decrease looks like
> > the connections are being drained
> >
> > Cheers,
> >
>
> --
> Sébastien
Re: Traffic drop issue
Hello,

> On Thu, Oct 29, 2020 at 3:06 PM Sébastien EISSLER <sebastien@sdv.fr> wrote:
> >
> > Hello,
> >
> > Yes, we have a load balancer in front. We already investigated that path and did not detect any error or configuration limit.
> > The fact that the threads_limited counter increased sharply and that restarting Varnish restored the traffic suggests an issue on the Varnish side.
>
> Did you try to increase thread_pool_max?
>

Yes, we increased thread_pool_max and thread_pool_min after that issue.
For the moment everything works fine.

We are thinking about adjusting other thread-related parameters such as thread_pool_reserve.
Do you have any advice about that?

Regards
--
Sébastien
Re: Traffic drop issue
> > Did you try to increase thread_pool_max?
> >
>
> Yes, we increased thread_pool_max and thread_pool_min after that issue.
> For the moment everything works fine.

FWIW, it's in the documentation of the threads_limited counter, which
you (and Guillaume) didn't seem to notice (or remember) before I
brought this up. Correct me if I'm wrong of course, but more
importantly please let me know how we could improve the documentation
if that was not enough.

> We are thinking about adjusting other thread-related parameters such as thread_pool_reserve.
> Do you have any advice about that?

If you already increased thread_pool_min, you mechanically increased
the size of the reserve (5% of thread_pool_min by default) so I'd
suggest you keep monitoring threads_limited and see whether you need
more workers for your workload.
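
To make the arithmetic concrete, here is a minimal Python sketch of how the auto-tuned reserve scales with thread_pool_min. It only applies the 5% figure quoted above; the function name, the sample values and the integer rounding are assumptions for illustration, and Varnish's exact rounding may differ.

```python
def auto_reserve(thread_pool_min, percent=5):
    """Approximate reserve size when thread_pool_reserve is left auto-tuned.

    Assumption: plain integer arithmetic; the exact rounding used by
    Varnish is not stated in this thread.
    """
    return thread_pool_min * percent // 100


# Hypothetical thread_pool_min values, not taken from the attached param.show.
for pool_min in (100, 200, 500):
    print(f"thread_pool_min={pool_min} -> reserve of about {auto_reserve(pool_min)} threads")
```

Note that thread_pool_min applies per pool, so the figures above are per-pool as well.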

Dridi
Re: Traffic drop issue
What intrigues me is why the number of requests decreased.
Hitting thread_pool_max should make the number of requests plateau, not go
down.
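
As a rough illustration of that plateau, here is a Python sketch of a simplified model in which each in-flight request occupies one worker thread. The worker count and the service time are invented for the example; only the request rates come from the original report.

```python
def served_rate(offered_rps, workers, service_time_ms):
    """Requests per second actually served under a fixed worker limit.

    Simplified model (an assumption, not Varnish's scheduler): every
    request holds one worker for service_time_ms milliseconds.
    """
    capacity_rps = workers * 1000 / service_time_ms
    return min(offered_rps, capacity_rps)


# Rates reported in the thread: normal load, increased load, spike.
for offered in (1000, 1400, 2300):
    # 200 workers and 100 ms per request are hypothetical numbers.
    print(offered, "->", served_rate(offered, workers=200, service_time_ms=100))
```

In this model the served rate is capped at the capacity and never falls below it while demand stays high, which is why a decline toward 0 points to something else removing traffic.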

If there's a loadbalancer that realizes requests are getting dropped, and
so takes traffic away, it makes sense, just wanted to make sure.
If that's the case, the loadbalancer will probably give you some info there.

On top of threads_limited, sess_dropped and sess_queued are probably good
counters to check.
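
One possible way to watch these counters is to compare two snapshots and look at the growth. The Python sketch below assumes text shaped like `varnishstat -1` output (name, value, rate, description); the sample lines and their values are invented, and the helper names are hypothetical.

```python
WATCHED = ("MAIN.threads_limited", "MAIN.sess_dropped", "MAIN.sess_queued")


def parse_counters(text):
    """Map counter name to integer value from 'name value ...' lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def growth(before, after, names=WATCHED):
    """Difference of each watched counter between two snapshots."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


# Invented sample snapshots, for illustration only.
BEFORE = """\
MAIN.threads_limited        10         0.00 Threads hit max
MAIN.sess_dropped            0         0.00 Sessions dropped for thread
MAIN.sess_queued           120         0.01 Sessions queued for thread
"""
AFTER = """\
MAIN.threads_limited      4510         0.40 Threads hit max
MAIN.sess_dropped            0         0.00 Sessions dropped for thread
MAIN.sess_queued          9120         0.80 Sessions queued for thread
"""

deltas = growth(parse_counters(BEFORE), parse_counters(AFTER))
for name, delta in deltas.items():
    print(f"{name}: +{delta} ({'growing' if delta > 0 else 'stable'})")
```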

--
Guillaume Quintard
Re: Traffic drop issue
On Mon, Nov 2, 2020 at 3:59 PM Guillaume Quintard
<guillaume@varnish-software.com> wrote:
>
> What intrigues me is why the number of requests decreased. Hitting thread_pool_max should make the number of requests plateau, not go down.
>
> If there's a loadbalancer that realizes requests are getting dropped, and so takes traffic away, it makes sense, just wanted to make sure.
> If that's the case, the loadbalancer will probably give you some info there.

Likewise if too many backend fetches are triggered (low hit ratio?)
and pile up, they will get higher priority than client tasks.
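
The effect of that prioritization can be sketched with a toy priority queue in Python. This is only an illustration of the idea, not Varnish's actual queue implementation; the priority constants, class and task names are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical priority levels: the lower number is served first.
BACKEND_FETCH, CLIENT_REQUEST = 0, 1


class TaskQueue:
    """Toy queue that serves higher-priority tasks first, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves arrival order

    def push(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = TaskQueue()
for name, priority in [
    ("client-1", CLIENT_REQUEST),
    ("fetch-1", BACKEND_FETCH),
    ("client-2", CLIENT_REQUEST),
    ("fetch-2", BACKEND_FETCH),
]:
    queue.push(priority, name)

# Drain the queue: both fetches come out before any client task.
served = [queue.pop() for _ in range(len(queue))]
print(served)
```

If fetches keep arriving faster than workers free up, the client tasks at the back of such a queue wait indefinitely, which matches the pile-up described above.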

> On top of threads_limited, sess_dropped and sess_queued are probably good counters to check.

I think this doesn't apply to h2 requests.