Mailing List Archive

Understanding 503s
Hi,

I need some help understanding why the below 503 is happening.

I am logging 503s to a separate file and then querying as below.

varnishlog -q "BerespStatus eq 503" -w /whatever/file
varnishlog -r /whatever/file

What's going wrong here ?
What should I be looking at ?

Please let me know.

* << BeReq >> 45926757
- Begin bereq 45926756 pass
- Timestamp Start: 1618461576.793111 0.000000 0.000000
- BereqMethod POST
- BereqURL /index.php?&u=85&tr=aa
- BereqProtocol HTTP/1.1
- BereqHeader X-Forwarded-Proto: https
- BereqHeader X-Forwarded-Port: 443
- BereqHeader Host: graph.com
- BereqHeader X-Amzn-Trace-Id: Root=1-xxx
- BereqHeader Content-Length: 793
- BereqHeader sec-ch-ua: "Google Chrome";v="89", "Chromium";v="89",
";Not A Brand";v="99"
- BereqHeader accept: application/json, text/javascript, */*; q=0.01
- BereqHeader sec-ch-ua-mobile: ?0
- BereqHeader user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36
- BereqHeader content-type: application/x-www-form-urlencoded;
charset=UTF-8
- BereqHeader origin: https://mandy..com
- BereqHeader sec-fetch-site: same-site
- BereqHeader sec-fetch-mode: cors
- BereqHeader sec-fetch-dest: empty
- BereqHeader referer: https://mandy.com/
- BereqHeader accept-encoding: gzip, deflate, br
- BereqHeader accept-language: en-US,en;q=0.9
- BereqHeader X-Forwarded-For: 103.67.157.20, 10.0.0.170
- BereqHeader browser: other
- BereqHeader serverIp: 10.0.3.237
- BereqHeader serverId: abc01
- BereqHeader X-Varnish: 45926757
- VCL_call BACKEND_FETCH
- VCL_return fetch
- BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
127.0.0.1 56176
- BackendStart 127.0.0.1 8080
- Timestamp Bereq: 1618461577.074387 0.281276 0.281276
- FetchError http first read error: EOF
- BackendClose 32 reload_2021-04-13T130756.default
- Timestamp Beresp: 1618461577.074430 0.281319 0.000043
- Timestamp Error: 1618461577.074434 0.281323 0.000004
- BerespProtocol HTTP/1.1
- BerespStatus 503
- BerespReason Service Unavailable
- BerespReason Backend fetch failed
- BerespHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
- BerespHeader Server: Varnish
- VCL_call BACKEND_ERROR
- BerespHeader Content-Type: text/html; charset=utf-8
- BerespHeader Retry-After: 5
- VCL_return deliver
- Storage malloc Transient
- ObjProtocol HTTP/1.1
- ObjStatus 503
- ObjReason Backend fetch failed
- ObjHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
- ObjHeader Server: Varnish
- ObjHeader Content-Type: text/html; charset=utf-8
- ObjHeader Retry-After: 5
- Length 1288
- BereqAcct 948 793 1741 0 0 0
- End
Re: Understanding 503s [ In reply to ]
I have that defined as 2 minutes.
backend default {
.host = "127.0.0.1";
.port = "8080";
.first_byte_timeout = 120s;
}

That's why this error is puzzling.

Any other timeouts ( connect ? ) etc to look at ?

Also, in the above dump, how much time did it take ?

To me it looks like it was closed within a second ?


- BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
127.0.0.1 56176
- BackendStart 127.0.0.1 8080
- Timestamp Bereq: 1618461577.074387 0.281276 0.281276
- FetchError http first read error: EOF
- BackendClose 32 reload_2021-04-13T130756.default
- Timestamp Beresp: 1618461577.074430 0.281319 0.000043
- Timestamp Error: 1618461577.074434 0.281323 0.000004

On Thu, 15 Apr 2021 at 11:42, Frands Bjerring Hansen
<frands.hansen@team.blue> wrote:

> Look at the fetch error: http first read error: EOF
>
>
>
> Perhaps the backend closed the connection before sending any data or the
> first_byte_timeout defined for the backend has been reached. The default is
> 60 seconds.
>
>
>
>
>
>
>
> */ Frands Bjerring Hansen*
> Head of Technology, Linux
> Office: +45 70 40 00 00
> frands.hansen@team.blue
>
> *Operations - Linux*
> team.blue Denmark A/S
> Højvangen 4
> 8660 Skanderborg
> Denmark
> CVR: 29412006
>
>
>
>
> On 15/04/2021, 08.08, "varnish-misc" <varnish-misc-bounces+fbh=
> zitcom.dk@varnish-cache.org> wrote:
>
> Hi,
>
> I need some help understanding why the below 503 is happening.
>
>
>
> I am logging 503s to a separate file and then querying as below.
>
>
>
> varnishlog -q "BerespStatus eq 503" -w /whatever/file
>
> varnishlog -r /whatever/file
>
>
>
> What's going wrong here ?
>
> What should I be looking at ?
>
>
>
> Please let me know.
>
>
>
> * << BeReq >> 45926757
>
> - Begin bereq 45926756 pass
>
> - Timestamp Start: 1618461576.793111 0.000000 0.000000
>
> - BereqMethod POST
>
> - BereqURL /index.php?&u=85&tr=aa
>
> - BereqProtocol HTTP/1.1
>
> - BereqHeader X-Forwarded-Proto: https
>
> - BereqHeader X-Forwarded-Port: 443
>
> - BereqHeader Host: graph.com <http://graph.com>
>
> - BereqHeader X-Amzn-Trace-Id: Root=1-xxx
>
> - BereqHeader Content-Length: 793
>
> - BereqHeader sec-ch-ua: "Google Chrome";v="89", "Chromium";v="89",
> ";Not A Brand";v="99"
>
> - BereqHeader accept: application/json, text/javascript, */*; q=0.01
>
> - BereqHeader sec-ch-ua-mobile: ?0
>
> - BereqHeader user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36
>
> - BereqHeader content-type: application/x-www-form-urlencoded;
> charset=UTF-8
>
> - BereqHeader origin: https://mandy..com
>
> - BereqHeader sec-fetch-site: same-site
>
> - BereqHeader sec-fetch-mode: cors
>
> - BereqHeader sec-fetch-dest: empty
>
> - BereqHeader referer: https://mandy.com/
>
> - BereqHeader accept-encoding: gzip, deflate, br
>
> - BereqHeader accept-language: en-US,en;q=0.9
>
> - BereqHeader X-Forwarded-For: 103.67.157.20, 10.0.0.170
>
> - BereqHeader browser: other
>
> - BereqHeader serverIp: 10.0.3.237
>
> - BereqHeader serverId: abc01
>
> - BereqHeader X-Varnish: 45926757
>
> - VCL_call BACKEND_FETCH
>
> - VCL_return fetch
>
> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
> 127.0.0.1 56176
>
> - BackendStart 127.0.0.1 8080
>
> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
>
> - FetchError http first read error: EOF
>
> - BackendClose 32 reload_2021-04-13T130756.default
>
> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
>
> - Timestamp Error: 1618461577.074434 0.281323 0.000004
>
> - BerespProtocol HTTP/1.1
>
> - BerespStatus 503
>
> - BerespReason Service Unavailable
>
> - BerespReason Backend fetch failed
>
> - BerespHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>
> - BerespHeader Server: Varnish
>
> - VCL_call BACKEND_ERROR
>
> - BerespHeader Content-Type: text/html; charset=utf-8
>
> - BerespHeader Retry-After: 5
>
> - VCL_return deliver
>
> - Storage malloc Transient
>
> - ObjProtocol HTTP/1.1
>
> - ObjStatus 503
>
> - ObjReason Backend fetch failed
>
> - ObjHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>
> - ObjHeader Server: Varnish
>
> - ObjHeader Content-Type: text/html; charset=utf-8
>
> - ObjHeader Retry-After: 5
>
> - Length 1288
>
> - BereqAcct 948 793 1741 0 0 0
>
> - End
>
>
>
Re: Understanding 503s [ In reply to ]
Also, backend is apache 2.4
and running php-fpm.

On Thu, 15 Apr 2021 at 11:52, Maninder Singh <mandys@gmail.com> wrote:

> I have that defined as 2 minutes.
> backend default {
> .host = "127.0.0.1";
> .port = "8080";
> .first_byte_timeout = 120s;
> }
>
> That's why this error is puzzling.
>
> Any other timeouts ( connect ? ) etc to look at ?
>
> Also, in the above dump, how much time did it take ?
>
> To me it looks like it was closed within a second ?
>
>
> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
> 127.0.0.1 56176
> - BackendStart 127.0.0.1 8080
> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
> - FetchError http first read error: EOF
> - BackendClose 32 reload_2021-04-13T130756.default
> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
> - Timestamp Error: 1618461577.074434 0.281323 0.000004
>
> On Thu, 15 Apr 2021 at 11:42, Frands Bjerring Hansen
> <frands.hansen@team.blue> wrote:
>
>> Look at the fetch error: http first read error: EOF
>>
>>
>>
>> Perhaps the backend closed the connection before sending any data or the
>> first_byte_timeout defined for the backend has been reached. The default is
>> 60 seconds.
>>
>>
>>
>>
>>
>>
>>
>> */ Frands Bjerring Hansen*
>> Head of Technology, Linux
>> Office: +45 70 40 00 00
>> frands.hansen@team.blue
>>
>> *Operations - Linux*
>> team.blue Denmark A/S
>> Højvangen 4
>> 8660 Skanderborg
>> Denmark
>> CVR: 29412006
>>
>>
>>
>>
>> On 15/04/2021, 08.08, "varnish-misc" <varnish-misc-bounces+fbh=
>> zitcom.dk@varnish-cache.org> wrote:
>>
>> Hi,
>>
>> I need some help understanding why the below 503 is happening.
>>
>>
>>
>> I am logging 503s to a separate file and then querying as below.
>>
>>
>>
>> varnishlog -q "BerespStatus eq 503" -w /whatever/file
>>
>> varnishlog -r /whatever/file
>>
>>
>>
>> What's going wrong here ?
>>
>> What should I be looking at ?
>>
>>
>>
>> Please let me know.
>>
>>
>>
>> * << BeReq >> 45926757
>>
>> - Begin bereq 45926756 pass
>>
>> - Timestamp Start: 1618461576.793111 0.000000 0.000000
>>
>> - BereqMethod POST
>>
>> - BereqURL /index.php?&u=85&tr=aa
>>
>> - BereqProtocol HTTP/1.1
>>
>> - BereqHeader X-Forwarded-Proto: https
>>
>> - BereqHeader X-Forwarded-Port: 443
>>
>> - BereqHeader Host: graph.com <http://graph.com>
>>
>> - BereqHeader X-Amzn-Trace-Id: Root=1-xxx
>>
>> - BereqHeader Content-Length: 793
>>
>> - BereqHeader sec-ch-ua: "Google Chrome";v="89", "Chromium";v="89",
>> ";Not A Brand";v="99"
>>
>> - BereqHeader accept: application/json, text/javascript, */*; q=0.01
>>
>> - BereqHeader sec-ch-ua-mobile: ?0
>>
>> - BereqHeader user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
>> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36
>>
>> - BereqHeader content-type: application/x-www-form-urlencoded;
>> charset=UTF-8
>>
>> - BereqHeader origin: https://mandy..com
>>
>> - BereqHeader sec-fetch-site: same-site
>>
>> - BereqHeader sec-fetch-mode: cors
>>
>> - BereqHeader sec-fetch-dest: empty
>>
>> - BereqHeader referer: https://mandy.com/
>>
>> - BereqHeader accept-encoding: gzip, deflate, br
>>
>> - BereqHeader accept-language: en-US,en;q=0.9
>>
>> - BereqHeader X-Forwarded-For: 103.67.157.20, 10.0.0.170
>>
>> - BereqHeader browser: other
>>
>> - BereqHeader serverIp: 10.0.3.237
>>
>> - BereqHeader serverId: abc01
>>
>> - BereqHeader X-Varnish: 45926757
>>
>> - VCL_call BACKEND_FETCH
>>
>> - VCL_return fetch
>>
>> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
>> 127.0.0.1 56176
>>
>> - BackendStart 127.0.0.1 8080
>>
>> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
>>
>> - FetchError http first read error: EOF
>>
>> - BackendClose 32 reload_2021-04-13T130756.default
>>
>> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
>>
>> - Timestamp Error: 1618461577.074434 0.281323 0.000004
>>
>> - BerespProtocol HTTP/1.1
>>
>> - BerespStatus 503
>>
>> - BerespReason Service Unavailable
>>
>> - BerespReason Backend fetch failed
>>
>> - BerespHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>>
>> - BerespHeader Server: Varnish
>>
>> - VCL_call BACKEND_ERROR
>>
>> - BerespHeader Content-Type: text/html; charset=utf-8
>>
>> - BerespHeader Retry-After: 5
>>
>> - VCL_return deliver
>>
>> - Storage malloc Transient
>>
>> - ObjProtocol HTTP/1.1
>>
>> - ObjStatus 503
>>
>> - ObjReason Backend fetch failed
>>
>> - ObjHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>>
>> - ObjHeader Server: Varnish
>>
>> - ObjHeader Content-Type: text/html; charset=utf-8
>>
>> - ObjHeader Retry-After: 5
>>
>> - Length 1288
>>
>> - BereqAcct 948 793 1741 0 0 0
>>
>> - End
>>
>>
>>
>
Re: Understanding 503s [ In reply to ]
Hello,

For global timeouts:

https://varnish-cache.org/docs/trunk/reference/varnishd.html#run-time-parameters

They contain "timeout" in the name.

For VCL-defined timeouts:

https://varnish-cache.org/docs/trunk/reference/vcl-backend.html#timeout-attributes
https://varnish-cache.org/docs/trunk/reference/vcl-probe.html#attribute-timeout
https://varnish-cache.org/docs/trunk/reference/vcl-var.html

If the problem was a timeout, you would see this in the log:

FetchError first byte timeout

If you are using apache httpd there´s a good chance that incoming
connections are closed after 5s of inactivity by default. But when
varnish pools backend connections, the default backend_idle_timeout is
60s so you may end up reusing a connection that was closed by your
httpd server. You should increase your keep-alive timeout to meet
varnish´s expectations.

https://httpd.apache.org/docs/2.4/mod/core.html#keepalivetimeout

Please note that varnish has a similar timeout_idle that defaults to
5s too. So stacking varnish servers can lead to the same problem if
you rely on the defaults.

Dridi

On Thu, Apr 15, 2021 at 6:27 AM Maninder Singh <mandys@gmail.com> wrote:
>
> Also, backend is apache 2.4
> and running php-fpm.
>
> On Thu, 15 Apr 2021 at 11:52, Maninder Singh <mandys@gmail.com> wrote:
>>
>> I have that defined as 2 minutes.
>> backend default {
>> .host = "127.0.0.1";
>> .port = "8080";
>> .first_byte_timeout = 120s;
>> }
>>
>> That's why this error is puzzling.
>>
>> Any other timeouts ( connect ? ) etc to look at ?
>>
>> Also, in the above dump, how much time did it take ?
>>
>> To me it looks like it was closed within a second ?
>>
>>
>> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080 127.0.0.1 56176
>> - BackendStart 127.0.0.1 8080
>> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
>> - FetchError http first read error: EOF
>> - BackendClose 32 reload_2021-04-13T130756.default
>> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
>> - Timestamp Error: 1618461577.074434 0.281323 0.000004
>>
>> On Thu, 15 Apr 2021 at 11:42, Frands Bjerring Hansen <frands.hansen@team.blue> wrote:
>>>
>>> Look at the fetch error: http first read error: EOF
>>>
>>>
>>>
>>> Perhaps the backend closed the connection before sending any data or the first_byte_timeout defined for the backend has been reached. The default is 60 seconds.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> / Frands Bjerring Hansen
>>> Head of Technology, Linux
>>> Office: +45 70 40 00 00
>>> frands.hansen@team.blue
>>>
>>> Operations - Linux
>>> team.blue Denmark A/S
>>> Højvangen 4
>>> 8660 Skanderborg
>>> Denmark
>>> CVR: 29412006
>>>
>>>
>>>
>>>
>>> On 15/04/2021, 08.08, "varnish-misc" <varnish-misc-bounces+fbh=zitcom.dk@varnish-cache.org> wrote:
>>>
>>> Hi,
>>>
>>> I need some help understanding why the below 503 is happening.
>>>
>>>
>>>
>>> I am logging 503s to a separate file and then querying as below.
>>>
>>>
>>>
>>> varnishlog -q "BerespStatus eq 503" -w /whatever/file
>>>
>>> varnishlog -r /whatever/file
>>>
>>>
>>>
>>> What's going wrong here ?
>>>
>>> What should I be looking at ?
>>>
>>>
>>>
>>> Please let me know.
>>>
>>>
>>>
>>> * << BeReq >> 45926757
>>>
>>> - Begin bereq 45926756 pass
>>>
>>> - Timestamp Start: 1618461576.793111 0.000000 0.000000
>>>
>>> - BereqMethod POST
>>>
>>> - BereqURL /index.php?&u=85&tr=aa
>>>
>>> - BereqProtocol HTTP/1.1
>>>
>>> - BereqHeader X-Forwarded-Proto: https
>>>
>>> - BereqHeader X-Forwarded-Port: 443
>>>
>>> - BereqHeader Host: graph.com <http://graph.com>
>>>
>>> - BereqHeader X-Amzn-Trace-Id: Root=1-xxx
>>>
>>> - BereqHeader Content-Length: 793
>>>
>>> - BereqHeader sec-ch-ua: "Google Chrome";v="89", "Chromium";v="89", ";Not A Brand";v="99"
>>>
>>> - BereqHeader accept: application/json, text/javascript, */*; q=0.01
>>>
>>> - BereqHeader sec-ch-ua-mobile: ?0
>>>
>>> - BereqHeader user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36
>>>
>>> - BereqHeader content-type: application/x-www-form-urlencoded; charset=UTF-8
>>>
>>> - BereqHeader origin: https://mandy..com
>>>
>>> - BereqHeader sec-fetch-site: same-site
>>>
>>> - BereqHeader sec-fetch-mode: cors
>>>
>>> - BereqHeader sec-fetch-dest: empty
>>>
>>> - BereqHeader referer: https://mandy.com/
>>>
>>> - BereqHeader accept-encoding: gzip, deflate, br
>>>
>>> - BereqHeader accept-language: en-US,en;q=0.9
>>>
>>> - BereqHeader X-Forwarded-For: 103.67.157.20, 10.0.0.170
>>>
>>> - BereqHeader browser: other
>>>
>>> - BereqHeader serverIp: 10.0.3.237
>>>
>>> - BereqHeader serverId: abc01
>>>
>>> - BereqHeader X-Varnish: 45926757
>>>
>>> - VCL_call BACKEND_FETCH
>>>
>>> - VCL_return fetch
>>>
>>> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080 127.0.0.1 56176
>>>
>>> - BackendStart 127.0.0.1 8080
>>>
>>> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
>>>
>>> - FetchError http first read error: EOF
>>>
>>> - BackendClose 32 reload_2021-04-13T130756.default
>>>
>>> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
>>>
>>> - Timestamp Error: 1618461577.074434 0.281323 0.000004
>>>
>>> - BerespProtocol HTTP/1.1
>>>
>>> - BerespStatus 503
>>>
>>> - BerespReason Service Unavailable
>>>
>>> - BerespReason Backend fetch failed
>>>
>>> - BerespHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>>>
>>> - BerespHeader Server: Varnish
>>>
>>> - VCL_call BACKEND_ERROR
>>>
>>> - BerespHeader Content-Type: text/html; charset=utf-8
>>>
>>> - BerespHeader Retry-After: 5
>>>
>>> - VCL_return deliver
>>>
>>> - Storage malloc Transient
>>>
>>> - ObjProtocol HTTP/1.1
>>>
>>> - ObjStatus 503
>>>
>>> - ObjReason Backend fetch failed
>>>
>>> - ObjHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
>>>
>>> - ObjHeader Server: Varnish
>>>
>>> - ObjHeader Content-Type: text/html; charset=utf-8
>>>
>>> - ObjHeader Retry-After: 5
>>>
>>> - Length 1288
>>>
>>> - BereqAcct 948 793 1741 0 0 0
>>>
>>> - End
>>>
>>>
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
_______________________________________________
varnish-misc mailing list
varnish-misc@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
Re: Understanding 503s [ In reply to ]
Thank you Dridi.
This is very helpful.

FYI - My apache keepalive is
KeepAliveTimeout 3

You would suggest increasing this to 5-10 ?

We had lowered the KeepAliveTimeout as the server is a very busy one and we
want to handle many connections.


On Thu, 15 Apr 2021 at 12:51, Dridi Boukelmoune <dridi@varni.sh> wrote:

> Hello,
>
> For global timeouts:
>
>
> https://varnish-cache.org/docs/trunk/reference/varnishd.html#run-time-parameters
>
> They contain "timeout" in the name.
>
> For VCL-defined timeouts:
>
>
> https://varnish-cache.org/docs/trunk/reference/vcl-backend.html#timeout-attributes
>
> https://varnish-cache.org/docs/trunk/reference/vcl-probe.html#attribute-timeout
> https://varnish-cache.org/docs/trunk/reference/vcl-var.html
>
> If the problem was a timeout, you would see this in the log:
>
> FetchError first byte timeout
>
> If you are using apache httpd there´s a good chance that incoming
> connections are closed after 5s of inactivity by default. But when
> varnish pools backend connections, the default backend_idle_timeout is
> 60s so you may end up reusing a connection that was closed by your
> httpd server. You should increase your keep-alive timeout to meet
> varnish´s expectations.
>
> https://httpd.apache.org/docs/2.4/mod/core.html#keepalivetimeout
>
> Please note that varnish has a similar timeout_idle that defaults to
> 5s too. So stacking varnish servers can lead to the same problem if
> you rely on the defaults.
>
> Dridi
>
> On Thu, Apr 15, 2021 at 6:27 AM Maninder Singh <mandys@gmail.com> wrote:
> >
> > Also, backend is apache 2.4
> > and running php-fpm.
> >
> > On Thu, 15 Apr 2021 at 11:52, Maninder Singh <mandys@gmail.com> wrote:
> >>
> >> I have that defined as 2 minutes.
> >> backend default {
> >> .host = "127.0.0.1";
> >> .port = "8080";
> >> .first_byte_timeout = 120s;
> >> }
> >>
> >> That's why this error is puzzling.
> >>
> >> Any other timeouts ( connect ? ) etc to look at ?
> >>
> >> Also, in the above dump, how much time did it take ?
> >>
> >> To me it looks like it was closed within a second ?
> >>
> >>
> >> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
> 127.0.0.1 56176
> >> - BackendStart 127.0.0.1 8080
> >> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
> >> - FetchError http first read error: EOF
> >> - BackendClose 32 reload_2021-04-13T130756.default
> >> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
> >> - Timestamp Error: 1618461577.074434 0.281323 0.000004
> >>
> >> On Thu, 15 Apr 2021 at 11:42, Frands Bjerring Hansen
> <frands.hansen@team.blue> wrote:
> >>>
> >>> Look at the fetch error: http first read error: EOF
> >>>
> >>>
> >>>
> >>> Perhaps the backend closed the connection before sending any data or
> the first_byte_timeout defined for the backend has been reached. The
> default is 60 seconds.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> / Frands Bjerring Hansen
> >>> Head of Technology, Linux
> >>> Office: +45 70 40 00 00
> >>> frands.hansen@team.blue
> >>>
> >>> Operations - Linux
> >>> team.blue Denmark A/S
> >>> Højvangen 4
> >>> 8660 Skanderborg
> >>> Denmark
> >>> CVR: 29412006
> >>>
> >>>
> >>>
> >>>
> >>> On 15/04/2021, 08.08, "varnish-misc" <varnish-misc-bounces+fbh=
> zitcom.dk@varnish-cache.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I need some help understanding why the below 503 is happening.
> >>>
> >>>
> >>>
> >>> I am logging 503s to a separate file and then querying as below.
> >>>
> >>>
> >>>
> >>> varnishlog -q "BerespStatus eq 503" -w /whatever/file
> >>>
> >>> varnishlog -r /whatever/file
> >>>
> >>>
> >>>
> >>> What's going wrong here ?
> >>>
> >>> What should I be looking at ?
> >>>
> >>>
> >>>
> >>> Please let me know.
> >>>
> >>>
> >>>
> >>> * << BeReq >> 45926757
> >>>
> >>> - Begin bereq 45926756 pass
> >>>
> >>> - Timestamp Start: 1618461576.793111 0.000000 0.000000
> >>>
> >>> - BereqMethod POST
> >>>
> >>> - BereqURL /index.php?&u=85&tr=aa
> >>>
> >>> - BereqProtocol HTTP/1.1
> >>>
> >>> - BereqHeader X-Forwarded-Proto: https
> >>>
> >>> - BereqHeader X-Forwarded-Port: 443
> >>>
> >>> - BereqHeader Host: graph.com <http://graph.com>
> >>>
> >>> - BereqHeader X-Amzn-Trace-Id: Root=1-xxx
> >>>
> >>> - BereqHeader Content-Length: 793
> >>>
> >>> - BereqHeader sec-ch-ua: "Google Chrome";v="89",
> "Chromium";v="89", ";Not A Brand";v="99"
> >>>
> >>> - BereqHeader accept: application/json, text/javascript, */*;
> q=0.01
> >>>
> >>> - BereqHeader sec-ch-ua-mobile: ?0
> >>>
> >>> - BereqHeader user-agent: Mozilla/5.0 (Windows NT 10.0; Win64;
> x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128
> Safari/537.36
> >>>
> >>> - BereqHeader content-type: application/x-www-form-urlencoded;
> charset=UTF-8
> >>>
> >>> - BereqHeader origin: https://mandy..com
> >>>
> >>> - BereqHeader sec-fetch-site: same-site
> >>>
> >>> - BereqHeader sec-fetch-mode: cors
> >>>
> >>> - BereqHeader sec-fetch-dest: empty
> >>>
> >>> - BereqHeader referer: https://mandy.com/
> >>>
> >>> - BereqHeader accept-encoding: gzip, deflate, br
> >>>
> >>> - BereqHeader accept-language: en-US,en;q=0.9
> >>>
> >>> - BereqHeader X-Forwarded-For: 103.67.157.20, 10.0.0.170
> >>>
> >>> - BereqHeader browser: other
> >>>
> >>> - BereqHeader serverIp: 10.0.3.237
> >>>
> >>> - BereqHeader serverId: abc01
> >>>
> >>> - BereqHeader X-Varnish: 45926757
> >>>
> >>> - VCL_call BACKEND_FETCH
> >>>
> >>> - VCL_return fetch
> >>>
> >>> - BackendOpen 32 reload_2021-04-13T130756.default 127.0.0.1 8080
> 127.0.0.1 56176
> >>>
> >>> - BackendStart 127.0.0.1 8080
> >>>
> >>> - Timestamp Bereq: 1618461577.074387 0.281276 0.281276
> >>>
> >>> - FetchError http first read error: EOF
> >>>
> >>> - BackendClose 32 reload_2021-04-13T130756.default
> >>>
> >>> - Timestamp Beresp: 1618461577.074430 0.281319 0.000043
> >>>
> >>> - Timestamp Error: 1618461577.074434 0.281323 0.000004
> >>>
> >>> - BerespProtocol HTTP/1.1
> >>>
> >>> - BerespStatus 503
> >>>
> >>> - BerespReason Service Unavailable
> >>>
> >>> - BerespReason Backend fetch failed
> >>>
> >>> - BerespHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
> >>>
> >>> - BerespHeader Server: Varnish
> >>>
> >>> - VCL_call BACKEND_ERROR
> >>>
> >>> - BerespHeader Content-Type: text/html; charset=utf-8
> >>>
> >>> - BerespHeader Retry-After: 5
> >>>
> >>> - VCL_return deliver
> >>>
> >>> - Storage malloc Transient
> >>>
> >>> - ObjProtocol HTTP/1.1
> >>>
> >>> - ObjStatus 503
> >>>
> >>> - ObjReason Backend fetch failed
> >>>
> >>> - ObjHeader Date: Thu, 15 Apr 2021 04:39:37 GMT
> >>>
> >>> - ObjHeader Server: Varnish
> >>>
> >>> - ObjHeader Content-Type: text/html; charset=utf-8
> >>>
> >>> - ObjHeader Retry-After: 5
> >>>
> >>> - Length 1288
> >>>
> >>> - BereqAcct 948 793 1741 0 0 0
> >>>
> >>> - End
> >>>
> >>>
> >
> > _______________________________________________
> > varnish-misc mailing list
> > varnish-misc@varnish-cache.org
> > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
Re: Understanding 503s [ In reply to ]
On Thu, Apr 15, 2021 at 7:27 AM Maninder Singh <mandys@gmail.com> wrote:
>
> Thank you Dridi.
> This is very helpful.
>
> FYI - My apache keepalive is
> KeepAliveTimeout 3
>
> You would suggest increasing this to 5-10 ?

If varnish is httpd´s only client then increase it to 70s. Varnish
will close unused connections after 60s by default, and if it´s really
really busy that gives a 10s window for the actual shutdown to happen.

If there are other direct clients in front of your httpd server, then
decrease backend_idle_timeout in varnishd to 2s, but then you will
force varnish to establish connections more often. This is already the
case of course, but at least that will reduce the risk of reusing a
closed connection and failing backend fetches for this reason.

> We had lowered the KeepAliveTimeout as the server is a very busy one and we want to handle many connections.

I understand, and there´s a good reason to have a low default when you
can´t trust the clients. It boils down to whether your httpd server is
openly accessible to more than just varnish, including potentially
malicious clients.

Dridi
_______________________________________________
varnish-misc mailing list
varnish-misc@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
Re: Understanding 503s [ In reply to ]
Apache runs on port 8080 and is not open to the outside world.
All requests are routed through varnish but then not all requests are
cached.

I guess in that case, varnish becomes the only client for apache.
So, I should increase the KeepAliveTimeout.



On Thu, 15 Apr 2021 at 13:45, Dridi Boukelmoune <dridi@varni.sh> wrote:

> On Thu, Apr 15, 2021 at 7:27 AM Maninder Singh <mandys@gmail.com> wrote:
> >
> > Thank you Dridi.
> > This is very helpful.
> >
> > FYI - My apache keepalive is
> > KeepAliveTimeout 3
> >
> > You would suggest increasing this to 5-10 ?
>
> If varnish is httpd´s only client then increase it to 70s. Varnish
> will close unused connections after 60s by default, and if it´s really
> really busy that gives a 10s window for the actual shutdown to happen.
>
> If there are other direct clients in front of your httpd server, then
> decrease backend_idle_timeout in varnishd to 2s, but then you will
> force varnish to establish connections more often. This is already the
> case of course, but at least that will reduce the risk of reusing a
> closed connection and failing backend fetches for this reason.
>
> > We had lowered the KeepAliveTimeout as the server is a very busy one and
> we want to handle many connections.
>
> I understand, and there´s a good reason to have a low default when you
> can´t trust the clients. It boils down to whether your httpd server is
> openly accessible to more than just varnish, including potentially
> malicious clients.
>
> Dridi
>
Re: Understanding 503s [ In reply to ]
I finally figured out why this was happening.

Hope this helps someone.

We were running php-fpm and had the following configuration.

pm = dynamic
pm.max_children = 166
pm.start_servers = 16
pm.min_spare_servers = 8
pm.max_spare_servers = 16

This was working fine with the usual load.
But, we found that whenever there was a spike it led to an increase in 503s.

This was due to start_servers set to 16.

php-fpm takes a sec to spawn more processes and during that time we see
503s.

For a high traffic site ( like ours ), we had to set this to
pm = static
pm.max_children = 125

The above values are kept keeping in mind our RAM.
These would be different for others.

Now, we don't see any 503s as the server is prepared to handle more
connections.

Thanks,

On Thu, 15 Apr 2021 at 14:03, Maninder Singh <mandys@gmail.com> wrote:

> Apache runs on port 8080 and is not open to the outside world.
> All requests are routed through varnish but then not all requests are
> cached.
>
> I guess in that case, varnish becomes the only client for apache.
> So, I should increase the KeepAliveTimeout.
>
>
>
> On Thu, 15 Apr 2021 at 13:45, Dridi Boukelmoune <dridi@varni.sh> wrote:
>
>> On Thu, Apr 15, 2021 at 7:27 AM Maninder Singh <mandys@gmail.com> wrote:
>> >
>> > Thank you Dridi.
>> > This is very helpful.
>> >
>> > FYI - My apache keepalive is
>> > KeepAliveTimeout 3
>> >
>> > You would suggest increasing this to 5-10 ?
>>
>> If varnish is httpd´s only client then increase it to 70s. Varnish
>> will close unused connections after 60s by default, and if it´s really
>> really busy that gives a 10s window for the actual shutdown to happen.
>>
>> If there are other direct clients in front of your httpd server, then
>> decrease backend_idle_timeout in varnishd to 2s, but then you will
>> force varnish to establish connections more often. This is already the
>> case of course, but at least that will reduce the risk of reusing a
>> closed connection and failing backend fetches for this reason.
>>
>> > We had lowered the KeepAliveTimeout as the server is a very busy one
>> and we want to handle many connections.
>>
>> I understand, and there´s a good reason to have a low default when you
>> can´t trust the clients. It boils down to whether your httpd server is
>> openly accessible to more than just varnish, including potentially
>> malicious clients.
>>
>> Dridi
>>
>
Re: Understanding 503s [ In reply to ]
Thanks for taking the time to debrief, I'm sure that will be useful in the
future!

--
Guillaume Quintard

On Fri, Apr 23, 2021, 00:04 Maninder Singh <mandys@gmail.com> wrote:

> I finally figured out why this was happening.
>
> Hope this helps someone.
>
> We were running php-fpm and had the following configuration.
>
> pm = dynamic
> pm.max_children = 166
> pm.start_servers = 16
> pm.min_spare_servers = 8
> pm.max_spare_servers = 16
>
> This was working fine with the usual load.
> But, we found that whenever there was a spike it led to an increase in
> 503s.
>
> This was due to start_servers set to 16.
>
> php-fpm takes a sec to spawn more processes and during that time we see
> 503s.
>
> For a high traffic site ( like ours ), we had to set this to
> pm = static
> pm.max_children = 125
>
> The above values are kept keeping in mind our RAM.
> These would be different for others.
>
> Now, we don't see any 503s as the server is prepared to handle more
> connections.
>
> Thanks,
>
> On Thu, 15 Apr 2021 at 14:03, Maninder Singh <mandys@gmail.com> wrote:
>
>> Apache runs on port 8080 and is not open to the outside world.
>> All requests are routed through varnish but then not all requests are
>> cached.
>>
>> I guess in that case, varnish becomes the only client for apache.
>> So, I should increase the KeepAliveTimeout.
>>
>>
>>
>> On Thu, 15 Apr 2021 at 13:45, Dridi Boukelmoune <dridi@varni.sh> wrote:
>>
>>> On Thu, Apr 15, 2021 at 7:27 AM Maninder Singh <mandys@gmail.com> wrote:
>>> >
>>> > Thank you Dridi.
>>> > This is very helpful.
>>> >
>>> > FYI - My apache keepalive is
>>> > KeepAliveTimeout 3
>>> >
>>> > You would suggest increasing this to 5-10 ?
>>>
>>> If varnish is httpd´s only client then increase it to 70s. Varnish
>>> will close unused connections after 60s by default, and if it´s really
>>> really busy that gives a 10s window for the actual shutdown to happen.
>>>
>>> If there are other direct clients in front of your httpd server, then
>>> decrease backend_idle_timeout in varnishd to 2s, but then you will
>>> force varnish to establish connections more often. This is already the
>>> case of course, but at least that will reduce the risk of reusing a
>>> closed connection and failing backend fetches for this reason.
>>>
>>> > We had lowered the KeepAliveTimeout as the server is a very busy one
>>> and we want to handle many connections.
>>>
>>> I understand, and there´s a good reason to have a low default when you
>>> can´t trust the clients. It boils down to whether your httpd server is
>>> openly accessible to more than just varnish, including potentially
>>> malicious clients.
>>>
>>> Dridi
>>>
>> _______________________________________________
> varnish-misc mailing list
> varnish-misc@varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>