Mailing List Archive

hit-for-pass vs. hit-for-miss
(quick brain dump before I need to rush out)

Geoff discovered an interesting consequence of a recent important change by
phk, and we just spent an hour discussing it:

before commit 9f272127c6fba76e6758d7ab7ba6527d9aad98b0, a hit-for-pass object
led to a pass; now it's a miss. IIUC the discussions we had on a trip to
Amsterdam, phk's main motivation was to eliminate the potentially deadly effect
unintentionally created hfps had on cache efficiency: No matter what, for the
lifetime of the hfp, all requests hitting that object became passes.

so, in short

- previously: an uncacheable response wins and sticks for its ttl
- now: a cacheable response wins and sticks for its ttl

or even shorter:

- previously: hit-for-pass
- now: hit-for-miss

From the perspective of a cache, the "now" case seems clearly favorable, but
Geoff has discovered that the reverse is true for a case which is important to
one of our projects:

- varnish is running in "do how the backend says" mode
- backend devs know when to make responses uncacheable
- a huge (600MB) backend response is uncacheable, but client-validatable

so this is the case for the previous semantics:

- 1st request creates the hfp
- 2nd request from client carries INM
- gets passed with INM
- 304 from backend goes to client

What we have now is:

- 1st request creates the hfm (hit-for-miss)
- 2nd request is a miss
- INM gets removed
- backend sends 600MB unnecessarily
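To make the scenario concrete, here is a minimal sketch of the backend side; the Cache-Control matching and the 120s hfm lifetime are illustrative assumptions, not our actual VCL:

    sub vcl_backend_response {
        # devs declare uncacheability via Cache-Control; setting
        # beresp.uncacheable is what creates the hfm object
        if (beresp.http.Cache-Control ~ "(?i)no-store|private") {
            set beresp.uncacheable = true;
            set beresp.ttl = 120s;
            return (deliver);
        }
    }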

We've thought about a couple of options which I want to write down before they
expire from my cache:

* decide in vcl_hit

sub vcl_hit {
    if (obj.uncacheable) {
        if (obj.http.Criterium) {
            return (miss);
        } else {
            return (pass);
        }
    }
}

* Do not strip INM/IMS on miss, and have a bereq property telling us whether
  the miss came from a hit-for-miss object

- core code keeps INM/IMS
- builtin.vcl strips them in vcl_miss
- vcl_miss can check for the hit-for-miss property
- any 304 backend response is forced uncacheable
- interesting detail: can it still create an hfp object?

BUT: how would we know in vcl_miss whether we are seeing
*client* INM/IMS or varnish-generated INM/IMS?
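As a sketch of the builtin.vcl half of this option (bereq.is_hitmiss is a hypothetical property that does not exist today):

    sub vcl_miss {
        if (!bereq.is_hitmiss) {
            # ordinary miss: conditional headers must not reach the backend
            unset bereq.http.If-Modified-Since;
            unset bereq.http.If-None-Match;
        }
        return (fetch);
    }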

So at this point I only see the YAS option.

Nils

_______________________________________________
varnish-dev mailing list
varnish-dev@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
Re: hit-for-pass vs. hit-for-miss

On 09/02/2016 08:10 PM, Nils Goroll wrote:
>
> - varnish is running in "do how the backend says" mode
> - backend devs know when to make responses uncacheable

Just so that you know what that's all about, to understand the use case:

We have a project at which TTLs for caching are determined almost
exclusively from Cache-Control (there are some exceptions, but that is
far and away the most common case).

This means that we run Varnish with -t 0, and set beresp.ttl = 0s if neither
Cache-Control nor Expires is present in a response. So setting a TTL for
caching is entirely the responsibility of the devs.
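In VCL terms, the policy is roughly this (a sketch, simplified from our actual config):

    sub vcl_backend_response {
        # varnishd runs with -t 0; only explicit caching headers
        # set by the devs make a response cacheable
        if (!beresp.http.Cache-Control && !beresp.http.Expires) {
            set beresp.ttl = 0s;
        }
    }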

That means in turn that we do not have long sequences of regexen in
vcl_recv to decide: lookup for these and pass for those. I'm sure
everyone knows how unmaintainable that sort of thing can become, and
for us it's absolutely out of the question. There are far too many
dev teams, choosing and changing their URL patterns all the time; we
could never keep the patterns up to date, nor do we want Varnish to do
all that regexing on every request.

If you think about it, this is the way the world should be -- devs are
fully responsible for setting TTLs with Cache-Control.

But what that means is that we have no way of knowing in advance which
requests will be passed -- meaning, we don't know it in vcl_recv
solely on the basis of the client request. I'd have to check, but it
wouldn't surprise me if "return(pass)" does not appear anywhere in our
VCL.

Setting beresp.uncacheable is the only way we can determine that
subsequent requests for the same URL can be passed.

That leads us to the problem that Nils described -- prior to the
change that he mentioned, requests with bereq.uncacheable were pass,
so req headers were filtered into the bereq as in the pass case. In
particular, conditional headers (IMS/INM) were not filtered out. "If
the client asks for validation, and we're passing, then we ask the
backend for validation, and if the backend says 304, pass it along."

After the change, requests with bereq.uncacheable are misses, so the
conditional headers are filtered out.

So our situation now is:

* Setting beresp.uncacheable is the only way we can know that a
request can be passed.

* But now it has become impossible to pass along conditional requests
under these circumstances, even if both the client and backend are
able to do everything right with IMS/INM and Last-Modified/ETag.

I'll let Nils' mail describe the rest. What we're missing now is: set
up the bereq for pass, in particular allowing the conditional headers,
not because we've set it to pass in vcl_recv (which we can't do), but
because we've set beresp.uncacheable for a previous beresp.


Thanks,
Geoff
--
** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

Tel +49 40 2880 5731
Mob +49 176 636 90917
Fax +49 40 42949753

http://uplex.de

Re: hit-for-pass vs. hit-for-miss
Hi,

TL;DR please shout if you think you need the choice between hit-for-pass and
hit-for-miss.



On 02/09/16 20:10, Nils Goroll wrote:
> - previously: hit-for-pass
> - now: hit-for-miss

On IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

I would like to understand if bringing back hit-for-pass is a specific
requirement we have (in which case a vmod producing quite some overhead would be
the right thing to do) or if others have more cases which would justify a
generic solution in varnish core like this one:

> sub vcl_hit {
>     if (obj.uncacheable) {
>         if (obj.http.Criterium) {
>             return (miss);
>         } else {
>             return (pass);
>         }
>     }
> }

Thank you, Nils


*) using a secondary cache index (maybe as in the xkey vmod): mark objects we
want to pass in vcl_backend_response, check in vcl_recv whether the object is
marked, and return(pass) if so.
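A sketch with a hypothetical vmod interface (the vmod name and functions are invented here, loosely modeled on xkey's secondary index):

    import hfp;

    sub vcl_recv {
        # secondary-index lookup: was this URL marked for pass?
        if (hfp.marked(req.url)) {
            return (pass);
        }
    }

    sub vcl_backend_response {
        if (beresp.uncacheable) {
            # remember this URL for future pass decisions
            hfp.mark(bereq.url, 120s);
        }
    }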

Re: hit-for-pass vs. hit-for-miss
This is worth responding to from vacation.

The "specific requirement we have" is a consequence of applying the HTTP protocol the way it was meant to be used -- responses specify their cacheability, without VCL having to intervene to classify which requests go to lookup or pass (typically with a sequence of regexes matched against URL patterns in vcl_recv).

That may indeed be unusual, but I see that as a sad commentary on the state of web developers' knowledge about caching and HTTP, not as somebody's peculiar requirement.

It would strike me as rather odd if a caching proxy has to treat it as a special case when backends actually do something the right way (like always set Cache-Control to determine TTLs).


Geoff

Sent from my iPhone

> On Sep 7, 2016, at 1:01 PM, Nils Goroll <slink@schokola.de> wrote:
>
> Hi,
>
> TL;DR please shout if you think you need the choice between hit-for-pass and
> hit-for-miss.
>
>
>
>> On 02/09/16 20:10, Nils Goroll wrote:
>> - previously: hit-for-pass
>> - now: hit-for-miss
>
> on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)
>
> I would like to understand if bringing back hit-for-pass is a specific
> requirement we have (in which case a vmod producing quite some overhead would be
> the right thing to do) or if others have more cases which would justify a
> generic solution in varnish core like this one:
>
>> sub vcl_hit {
>>     if (obj.uncacheable) {
>>         if (obj.http.Criterium) {
>>             return (miss);
>>         } else {
>>             return (pass);
>>         }
>>     }
>> }
>
> Thank you, Nils
>
>
> *) using a secondary cache index (maybe as in the xkey vmod), mark objects we
> want to pass for in backend_response, check in recv if the object is marked and
> return(pass) if so.
>


Re: hit-for-pass vs. hit-for-miss
--------
In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>, Geoffrey Simmons writes:

>That may indeed be unusual, but I see that a sad commentary on the
>state of web developers' knowledge about caching and HTTP. Not as
>somebody's peculiar requirement.

Oh, man, if you think that is a sad commentary, wait till I get started...

>It would strike me as rather odd if a caching proxy has to treat
>it as a special case when backends actually do something the right
>way (like always set Cache-Control to determine TTLs).

IMO the unusual detail is that it takes several minutes to fetch
the object from the backend, and that people are trying to find
a way to mitigate/work around that special case.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: hit-for-pass vs. hit-for-miss
--------
In message <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>, Nils Goroll writes:

>on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

Only as a workaround for this corner case where the backend takes minutes
to reply.

It is not my impression that there is an issue with "normal" backend
response times, but correct me if I'm wrong?

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: hit-for-pass vs. hit-for-miss
On 07/09/16 22:02, Poul-Henning Kamp wrote:
> It is not my impression that there is an issue with "normal" backend
> response times, but correct me if I'm wrong ?

The general issue is that for a miss, we should remove IMS/INM, but not for a pass.

So any backend function where validation is efficient while delivery is not will
be hit.

Nils

Re: hit-for-pass vs. hit-for-miss
For one thing, I wouldn't call it a workaround to use Last-Modified/ETag and IMS/INM to avoid sending a very large response unless it's necessary. Validation is meant to solve just that sort of problem.

But that's not the main thing I'm talking about. We have required devs to always decide about TTLs for caching themselves, including TTL=0, by setting Cache-Control. I know that's unusual, but it's the way things really ought to be.

However, that eliminates the possibility of knowing at recv time that a request can be passed. Again, I know it's common to have something like lists of regexen in vcl_recv to decide what goes to lookup and what goes to pass. But it shouldn't *have* to be that way -- ideally, it shouldn't be necessary at all.

Without HFP, we're left with no way at all of knowing which requests can be passed. Which is bad enough, but it also in turn eliminates the possibility of using IMS/INM for uncacheable responses.

Altogether, it means that we ask devs to use HTTP for caching as the protocol intends, but then it's a problem case for a caching proxy for HTTP (and Nils has indicated that a VMOD might have performance problems). That IMO gets us into Alice in Wonderland territory.

Sent from my iPhone

> On Sep 7, 2016, at 4:00 PM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
>
> --------
> In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>, Geoffrey Simmons writes:
>
>> That may indeed be unusual, but I see that a sad commentary on the
>> state of web developers' knowledge about caching and HTTP. Not as
>> somebody's peculiar requirement.
>
> Oh, man, if you think that is a sad commentary, wait till I get started...
>
>> It would strike me as rather odd if a caching proxy has to treat
>> it as a special case when backends actually do something the right
>> way (like always set Cache-Control to determine TTLs).
>
> IMO the unusual detail, is that it takes several minutes to fetch
> the object from the backend, and that people are trying to find
> a way to mitigate/work around that special case.
>
> --
> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG | TCP/IP since RFC 956
> FreeBSD committer | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.


Re: hit-for-pass vs. hit-for-miss
--------
In message <D6B56BB5-D2D2-4AC1-9DD7-3FCD1482CE67@uplex.de>, Geoffrey Simmons writes:

The problem with HFP as opposed to HFM is that it is waaaay outside the
specs, our trouble assigning TTLs being a very blunt hint about that.

I'm all for improving stuff in general, and any ideas/patches are
welcome, just don't try to claim that HFP was more RFC- or for that
matter POLA-compliant than HFM...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: hit-for-pass vs. hit-for-miss
On 07/09/16 23:20, Poul-Henning Kamp wrote:
> I'm all for improving stuff in general, and any ideas/patches are
> welcome

What's your opinion about the suggestion to add obj.uncacheable + return
miss/pass in vcl_hit (see the VCL mock in my initial email)?

The hard part about this is that we currently have an unsolved issue with miss
from hit anyway, which we last discussed on Monday. A transcript and summary
of my understanding is here:
https://github.com/varnishcache/varnish-cache/issues/1799

At this point I think that solving this will also be key to getting VCL control
over hfp/hfm.

Nils

Re: hit-for-pass vs. hit-for-miss
--------
In message <abfa5b5e-0874-cd37-e570-7922f80762ec@schokola.de>, Nils Goroll writes:
>On 07/09/16 23:20, Poul-Henning Kamp wrote:
>> I'm all for improving stuff in general, and any ideas/patches are
>> welcome
>
>what's your opinion about the suggestion to add obj.uncacheable + return
>miss/pass in vcl_hit (see vcl mock in my initial email).

It doesn't "feel" quite right, and it certainly does not seem like
something which is so obviously correct that I feel comfortable
slamming it in a week before a major release...

>The hard part about this is that we currently have an unsolved issue with miss
>from hit anyway, [...]

Yes, that is the tricky one, and like the other one, I don't think we
have anything which is good enough to stuff it in a week before the
release.

The good news is that there is only 6 months until the next release...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: hit-for-pass vs. hit-for-miss
>> On 07/09/16 23:20, Poul-Henning Kamp wrote:
>>> I'm all for improving stuff in general, and any ideas/patches are
>>> welcome
>>
>> what's your opinion about the suggestion to add obj.uncacheable + return
>> miss/pass in vcl_hit (see vcl mock in my initial email).
>
> It doesn't "feel" quite right

What exactly doesn't?

> it certainly does not seem like
> something which is so obviously correct that I feel comfortable
> slamming it in a week before a major release...

This is nothing personal, but the change to hfm with no fallback to the previous
logic doesn't seem so obviously correct that I feel comfortable having a major
release with it.

Nils

Re: hit-for-pass vs. hit-for-miss
--------
In message <f6301ea5-b23b-fc7d-6eba-3bf9a3787b43@schokola.de>, Nils Goroll writes:

>This is nothing personal, but the change to hfm with no fallback
>to the previous logic doesn't seem so obviously correct that I
>feel comfortable having a major release with it.

As I just said: 5.0 isn't going to be anybody's favourite release,
but it is happening anyway, because there are people out there
waiting for the part of it that actually works.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: hit-for-pass vs. hit-for-miss
Let's get this back on track:

Personally I don't care whether or not a solution to the hfp vs. hfm problem is
going to be in 5.0. For those people for whom this matters I have raised
concerns, so at least the dev community should be aware of it.

As many people know, I love running master code and I care about having good
code in master, no matter what magic date we have for a certain git branch
command. For the specific project we talked about, we have the option to stick
to 4.1.

So I'm fine with postponing this for a bit, but I want to see #1799 and this one
solved and I'd hope to have made sound suggestions to get us there. As always,
if there are better suggestions, I'm all ears.


Nils

Re: hit-for-pass vs. hit-for-miss
--------
In message <a3d0e091-0aef-8ae1-1ea1-2ab32f744ddf@schokola.de>, Nils Goroll writ
es:

>So I'm fine with postponing this for a bit, but I want to see #1799 and this one
>solved and I'd hope to have made sound suggestions to get us there. As always,
>if there are better suggestions, I'm all ears.

Agreed.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
