Varnish Dirty Caching
Hi,

I'd like to implement dirty caching using Varnish.

So what is dirty caching and why use it? Think of a very unreliable
backend. If Varnish can't reach its backend, it will simply return the
last content it has (even if that content is stale). That way I can
cover hiccups.

Is this possible? What are the side effects?
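(Purely as a sketch of what this could look like in VCL: the variable
names below are borrowed from later Varnish releases, do not exist in
the version discussed in this thread, and assume a health-probed
backend.)

sub vcl_recv {
    if (req.backend.healthy) {
        # Backend is up: only tolerate slightly stale objects.
        set req.grace = 30s;
    } else {
        # Backend is down: serve "dirty" content for up to an hour.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # Keep objects around for an hour past their TTL so they are
    # still available while the backend is unreachable.
    set beresp.grace = 1h;
}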




As far as I understand Varnish, the "normal" configuration works like
this:

A request comes in.

Try to find it in the cache.

If obj.cacheable (I can't find documentation on the exact meaning of
"cacheable"; probably a check that the Expires header is still in the
future), return this object to the client (will this update the Age
header?).

Otherwise, fetch the object from the backend. If it is cacheable,
insert it into the cache.

Return the object to the client.
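(Roughly, that flow maps onto VCL subroutines like the sketch below.
It is modelled on the early default VCL, so treat the exact names as
illustrative rather than authoritative.)

sub vcl_recv {
    if (req.request != "GET" && req.request != "HEAD") {
        pass;        # don't try to cache non-GET/HEAD requests
    }
    lookup;          # try to find the object in the cache
}

sub vcl_hit {
    if (!obj.cacheable) {
        pass;        # found, but not cacheable: go to the backend
    }
    deliver;         # cache hit: return the stored object
}

sub vcl_fetch {
    if (!obj.cacheable) {
        pass;        # backend says no: deliver without storing
    }
    insert;          # store in the cache, then deliver to the client
}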



Now, after some time, the reaper comes along. If an object has expired,
the reaper calls vcl_timeout, which will either discard the object or
fetch an update.
If I discard it on timeout, this keeps my cache tidy. If I always fetch
an update, the cache constantly keeps a complete copy of my backend (at
least the part that was hit at least once). Both options seem bad.
Keeping it tidy forces a complete retrieval of the object, even if it
didn't change on the backend (just new Expires headers). Keeping a copy
hammers my backend with requests for files that are normally hit once
every ten years.

So I'm slightly confused and looking for some documentation...


Greetings
Christoph
Re: Varnish Dirty Caching
Christoph <varnish at held-im-ruhestand.de> writes:
> So what is dirty caching and why use it? Think of a very unreliable
> backend. If varnish can't reach it's backend, it will simply return
> the last content it has (even if the content is stale). That way i can
> cover hickups.

It's on our list for 2.0, and will probably hit trunk in late July.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
Re: Varnish Dirty Caching
----- Dag-Erling Smørgrav <des at linpro.no> wrote:
> Christoph <varnish at held-im-ruhestand.de> writes:
> > So what is dirty caching and why use it? Think of a very unreliable
> > backend. If varnish can't reach it's backend, it will simply return
> > the last content it has (even if the content is stale). That way i
> > can cover hickups.
> It's on our list for 2.0, and will probably hit trunk in late July.

Way cool, really looking forward to that feature! As I stated in a
different thread, it's more often other parts of the system that break
than Varnish, so having such a failsafe would let us build much more
fault-tolerant setups.

Regards
--
Denis Braekhus - Teknisk Ansvarlig ABC Startsiden AS
http://www.startsiden.no
Re: Varnish Dirty Caching
In message <20070702211450.GA16119 at falcon>, Christoph writes:

>i'd like to implement dirty-caching using varnish.

I'm busy twisting the variable visibility in VCL into proper shape
right now, and that will move us a bit closer to what you want
to do.

The critical question is how we define "backend is down" and how
fast and efficiently we can detect it.

Ideas for how to express it in VCL are very welcome.
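(One possible way to express it, sketched in a syntax along the lines
of what later releases eventually adopted; the probe block and the
req.backend.healthy variable are assumptions here, not existing
features:)

backend www {
    .host = "backend.example.com";
    .port = "80";
    .probe = {
        .url = "/";          # what to request when probing
        .interval = 5s;      # how often to probe
        .timeout = 1s;       # how long to wait for an answer
        .window = 5;         # look at the last 5 probes...
        .threshold = 3;      # ...and require 3 good ones to call it healthy
    }
}

sub vcl_recv {
    if (!req.backend.healthy) {
        # Backend considered down: allow serving stale objects.
        set req.grace = 1h;
    }
}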

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Varnish Dirty Caching
Hello all,


When working for Speedera, we used to have the ability to define
two timeout timers in the config: timer1 and timer2.

timer1 would define the time to receive the TCP ACK of the HTTP
request to the backend (or something equivalent).

timer2 would define the time to receive the first chunk of HTTP data.

When one of these timers expired, we could configure the caches to:
1) serve stale content
2) serve a user-defined static HTML page or null GIF image
3) redirect the requests to a backup backend

With Savvis we had a variable called Background Refresh Mode which,
when set on a URL, would make the cache serve the content immediately
once it became stale, and the cache would then (try to) do the refresh
in the background.
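(The two-timer idea maps fairly naturally onto per-backend connect and
first-byte timeouts plus a fallback action; a hypothetical sketch,
again using names from later Varnish releases rather than anything
available at the time:)

backend www {
    .host = "backend.example.com";
    .port = "80";
    .connect_timeout = 1s;       # roughly "timer1": time to get the connection up
    .first_byte_timeout = 5s;    # roughly "timer2": time to the first byte of data
}

sub vcl_error {
    # Backend failed or timed out: serve a user-defined static page
    # instead of the raw error (option 2 above).
    set obj.http.Content-Type = "text/html; charset=utf-8";
    synthetic {"<html><body>Sorry, please try again shortly.</body></html>"};
    deliver;
}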

Damien,

Poul-Henning Kamp writes:
> In message <20070702211450.GA16119 at falcon>, Christoph writes:
>
> >i'd like to implement dirty-caching using varnish.
>
> I'm busy twisting the variable visibility in VCL into proper shape
> right now, and that will move us a bit closer to what you want
> to do.
>
> The critical question is how we define "backend is down" and how
> fast and efficient we can detect it.
>
> Ideas for how to express it in VCL are very welcome.
Re: Varnish Dirty Caching
Poul-Henning Kamp wrote:

> The critical question is how we define "backend is down" and how
> fast and efficient we can detect it.

Right. I tend to like the Perlbal approach: issue an HTTP OPTIONS
request and check whether we get anything back from the backend. It is
quite lightweight.
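(If the probe itself should be an OPTIONS request, a hypothetical
probe definition along the same later-Varnish lines could spell out
the raw request; again just a sketch, not syntax that existed at the
time:)

backend www {
    .host = "backend.example.com";
    .port = "80";
    .probe = {
        # Send a raw OPTIONS request instead of a GET.
        .request =
            "OPTIONS / HTTP/1.1"
            "Host: backend.example.com"
            "Connection: close";
        .interval = 5s;
        .timeout = 1s;
    }
}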

/Anton
Re: Varnish Dirty Caching
Poul-Henning Kamp wrote:

> The critical question is how we define "backend is down" and how
> fast and efficient we can detect it.
>
> Ideas for how to express it in VCL are very welcome.

Maybe naive:

# First, we decide how to "sniff" that a backend is down
#
# options_ping: Send an HTTP OPTIONS request (Perlbal does that)
# timeout: If the backend does not answer within x seconds, it is
# probably down
# icp: Abuse the ICP protocol (Squid + Zope do that)
backend.down_protocol = options_ping | timeout | icp

# What is the timeout limit?
backend.timeout = 30

# How long should the backend be marked as "down" before we try
# again?
backend.retry_after = 300


# And then just use it

if (backend.down) ...



/Anton Stonor