Mailing List Archive

Strange issue with probes
Hi,
I have a strange issue where Varnish suddenly stops sending probes, leaving
a backend marked healthy or sick until the next restart, and I'm unable to
determine why. Please note that my backend is able to receive my probes
(and actually receives them), and I get a response every time I run
curl -H "Host: healthcheck" 10.32.161.89/balance_me, so I consider my
backend ultimately "good" and "able to respond".

Thanks a lot for every hint!
Luca

This is my backend configuration:

probe backend_check {
    .request = "GET /balance_me HTTP/1.1"
               "Host: healthcheck"
               "Connection: close";
    .timeout = 1s;
    .interval = 2s;
    .window = 5;
    .threshold = 2;
}
backend othaph {
    .host = "10.32.161.89";
    .port = "80";
    .connect_timeout = 1s;
    .first_byte_timeout = 20s;
    .between_bytes_timeout = 20s;
    .probe = backend_check;
}

This is my "varnishadm backend.list"
boot.othaph probe Healthy 3/5

This is the complete output of 20 minutes of "varnishlog -g raw -i
Backend_health" (please note that backend.list above shows 3/5 while
apparently only 2 probes were sent):
0 Backend_health - boot.othaph Back healthy 4--X-RH 2 2 5 0.067021 0.033510 HTTP/1.1 200 OK
0 Backend_health - boot.othaph Still healthy 4--X-RH 3 2 5 0.015176 0.027399 HTTP/1.1 200 OK

And this is my "varnishadm backend.list -p"
Backend name Admin Probe
boot.othaph probe Healthy 3/5
Current states good: 3 threshold: 2 window: 5
Average response time of good probes: 0.027399
Oldest ================================================== Newest
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
-------------------------------------------------------------HHH Happy
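
A note on the "3/5 with only 2 probes sent" observation above: the VCL probe documentation describes an .initial attribute that defaults to .threshold - 1, meaning that many polls are counted as good at startup without being sent. The sketch below is a deliberately simplified model of that bookkeeping, not Varnish's actual code; the function name and its return shape are made up for illustration.

```python
def simulate_probe_window(results, window=5, threshold=2, initial=None):
    """Simplified model of a probe window; NOT Varnish's implementation.

    results: iterable of booleans, one per probe that was really sent.
    Returns a list of (good_count, healthy) tuples, one per sent probe.
    """
    if initial is None:
        # Assumption taken from the VCL docs: .initial defaults to
        # .threshold - 1.
        initial = threshold - 1
    # Startup entries that are counted as good without a probe being sent.
    history = [True] * initial
    states = []
    for ok in results:
        history.append(ok)
        history = history[-window:]  # keep only the newest `window` entries
        good = sum(history)
        states.append((good, good >= threshold))
    return states


# Probe from the first message: window 5, threshold 2, two good probes logged.
print(simulate_probe_window([True, True], window=5, threshold=2))
# -> [(2, True), (3, True)]
```

Under this model the two logged probes give good counts of 2 and then 3, which lines up with the "2 2 5" and "3 2 5" fields in the Backend_health log lines and with the three H marks next to only two 4/X/R marks in the bitmap.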
Re: Strange issue with probes [ In reply to ]
Hi! I have a problem with backend probes not being sent that seems very
similar to what Luca Gervasi has reported (see below).
Luca: Did you ever figure out a fix?

I run varnish version 5.1.3-1~xenial on ubuntu/xenial with
kernel 4.4.0-109-generic.

The behaviour I see is that I start up varnish, varnish sends one backend
probe for each backend, bringing them up to Healthy status. After that, it
seems like there are no more backend probes sent.

The backend replies correctly when I use curl, and in any case there is no
indication that a probe fails. They are simply not sent in the first place,
as far as I can tell.

Here is an example of my backend configuration:
backend foo1 {
    .host = "foo1";
    .port = "8080";
    .connect_timeout = 0.4s;
    .first_byte_timeout = 12s;
    .between_bytes_timeout = 1s;
    .max_connections = 40;
    .probe = {
        .url = "/healthcheck";
        .timeout = 1s;
        .interval = 6s;
        .window = 5;
        .threshold = 3;
    }
}

My backend.list looks like this:
boot.foo1 probe Healthy 4/5 Fri, 12 Jan 2018 08:50:38 GMT
boot.foo2 probe Healthy 4/5 Fri, 12 Jan 2018 08:50:38 GMT

and here is an extract from backend.list -p:
boot.foo1 probe Healthy 4/5
Current states good: 4 threshold: 3 window: 5
Average response time of good probes: 0.002105
Oldest ================================================== Newest
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HHHH Happy


And running "varnishlog -g raw -i Backend_health" gives me no results back.

I also tested the exact same config on 5.2.1~xenial, but I get the same
problem there.

I don't really understand what is going on here, or how I should proceed.
Any help would be greatly appreciated!

Best regards,
Håvard Futsæter


2017-10-19 8:44 GMT+02:00 Luca Gervasi <luca.gervasi@gmail.com>:

> Hi,
> I have a strange issue where Varnish suddenly stops sending probes, leaving
> a backend marked healthy or sick until the next restart, and I'm unable to
> determine why. Please note that my backend is able to receive my probes
> (and actually receives them), and I get a response every time I run
> curl -H "Host: healthcheck" 10.32.161.89/balance_me, so I consider my
> backend ultimately "good" and "able to respond".
>
> Thanks a lot for every hint!
> Luca
>
> This is my backend configuration:
>
> probe backend_check {
>     .request = "GET /balance_me HTTP/1.1"
>                "Host: healthcheck"
>                "Connection: close";
>     .timeout = 1s;
>     .interval = 2s;
>     .window = 5;
>     .threshold = 2;
> }
> backend othaph {
>     .host = "10.32.161.89";
>     .port = "80";
>     .connect_timeout = 1s;
>     .first_byte_timeout = 20s;
>     .between_bytes_timeout = 20s;
>     .probe = backend_check;
> }
>
> This is my "varnishadm backend.list"
> boot.othaph probe Healthy 3/5
>
> This is the total log of 20 minutes of "varnishlog -g raw -i
> Backend_health" (please note that above it shows 3/5 while i have only 2
> probes sent, apparently)
> 0 Backend_health - boot.othaph Back healthy 4--X-RH 2 2 5 0.067021 0.033510 HTTP/1.1 200 OK
> 0 Backend_health - boot.othaph Still healthy 4--X-RH 3 2 5 0.015176 0.027399 HTTP/1.1 200 OK
>
> And this is my "varnishadm backend.list -p"
> Backend name Admin Probe
> boot.othaph probe Healthy 3/5
> Current states good: 3 threshold: 2 window: 5
> Average response time of good probes: 0.027399
> Oldest ================================================== Newest
> --------------------------------------------------------------44 Good IPv4
> --------------------------------------------------------------XX Good Xmit
> --------------------------------------------------------------RR Good Recv
> -------------------------------------------------------------HHH Happy
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
Re: Strange issue with probes [ In reply to ]
Hello Håvard, Luca,

On Fri, Jan 12, 2018 at 10:13 AM, Håvard Alsaker Futsæter
<havardf@met.no> wrote:
> Hi! I have a problem with backend probes not being sent that seems very
> similar to what Luca Gervasi has reported (see below).
> Luca: Did you ever figure out a fix?

Guillaume brought this to my attention, but I haven't looked at it
yet. What's surprising is that I don't recall changes in this area
until _after_ the 5.2 release (so no changes for the whole 5.x
series).

The workaround is to override the probe's status with the
`backend.set_health` command if you can't rely on probes (or in
general if you wish to rely on external monitoring to change the
status of a backend).
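
For the external-monitoring case, a minimal sketch of a script that drives that command might look like the following. It assumes `varnishadm` is on the PATH and can reach the running instance, and that the states accepted by `backend.set_health` are `healthy`, `sick` and `auto` as listed in the varnish-cli documentation. The function names are made up for illustration, and how the monitor decides that a backend is up is left out.

```python
import subprocess

# States accepted by backend.set_health according to the varnish-cli docs.
VALID_STATES = ("healthy", "sick", "auto")


def set_health_command(backend, state):
    """Build the varnishadm argument list for a manual health override."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    return ["varnishadm", "backend.set_health", backend, state]


def apply_external_check(backend, is_up, run=subprocess.run):
    """Force the backend healthy or sick based on an external verdict.

    `run` is injectable so the function can be exercised without Varnish.
    """
    state = "healthy" if is_up else "sick"
    command = set_health_command(backend, state)
    run(command, check=True)  # raises CalledProcessError if varnishadm fails
    return command
```

A monitoring job would call something like `apply_external_check("boot.othaph", is_up=True)` after its own check. Setting the state back to `auto` hands the decision back to the probe.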

If either of you has a GitHub account, please open an issue.
Otherwise let me know and I will open one myself.

Dridi
Re: Strange issue with probes [ In reply to ]
Hi Dridi!

2018-01-12 10:42 GMT+01:00 Dridi Boukelmoune <dridi@varni.sh>:

> Hello Håvard, Luca,
>
> The workaround is to override the probe's status with the
> `backend.set_health` command if you can't rely on probes (or in
> general if you wish to rely on external monitoring to change the
> status of a backend).
>
> If any of you two has a github account, please open an issue.
> Otherwise let me know and I will open one myself.
>
>
Thanks for the response! I have opened an issue about this now.

Best regards,
Håvard

