Mailing List Archive

Migration from Squid
Hello,

I have been looking at Varnish as a replacement for our Squid
infrastructure for a little while now. I am very impressed with the
application and interested in implementing Varnish as a test for some
of our infrastructure. Reading through the wiki, there are three things
that I have not figured out how to do, or whether they are even
possible, which I've detailed below.

1. Cache Peer
- There are two use cases for the cache peer for our sites. The
first is the proxy only cache peer where the system will check another
proxy for an object and retrieve that object from cache vs. requesting
it from the origin server. This has proven to be extremely effective
at reducing the overall disk footprint of our caches while maintaining
a low hit rate on the origin server. The second of course is querying
the cache on another proxy, fetching and then caching on the local
box.

2. Redirect / Rewrite
- Obviously running a redirector / rewrite application via a perl
script isn't ideal for performance but has been proven to be an
amazing resource when migrating CMS platforms or to work around
"features" of a specific application platform.

3. Header Replace
- By default Squid enforces cache policy based on headers served
from the origin system. In some cases we need to then change those
headers when returning data to the client browser. An example would be
to modify the cache-control and expires headers to instruct the
browser not to cache. For a given site we could be setting this as a
global value, for specific URL patterns / directories, or for file
extensions.

Does Varnish support this currently? If not is it on the roadmap?

Thanks,
Max
Migration from Squid [ In reply to ]
"Max Clark" <max.clark at gmail.com> writes:

> Reading through the wiki there are three things that I have not
> figured out how to do / or if it's even possible which I've detailed
> below.
>
> 1. Cache Peer

> - There are two use cases for the cache peer for our sites. The
> first is the proxy only cache peer where the system will check
> another proxy for an object and retrieve that object from cache
> vs. requesting it from the origin server. This has proven to be
> extremely effective at reducing the overall disk footprint of our
> caches while maintaining a low hit rate on the origin server.
>
> The second of course is querying the cache on another proxy,
> fetching and then caching on the local box.

Sibling cache_peers are not a feature of Varnish. These are features
used to increase squid's performance by clustering, and implementing
different methods of retrieving cached data from another cache.

The functionality of parent cache_peers overlaps roughly with the
backend declarations of varnish.
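
As a rough sketch of that overlap, a Varnish instance can name another
cache as its backend, much as a squid parent would be named. The
backend name "parent", the host and the port are made up for
illustration, and the syntax is the 1.1-era form as I recall it from
vcl(7); later releases changed how backends are declared, so check
the man page for your version:

,----[ sketch, not from vcl(7) ]
|
| # Hypothetical upstream cache acting as the "parent"
| backend parent {
|     set backend.host = "cache-parent.example.com";
|     set backend.port = "80";
| }
|
| sub vcl_recv {
|     # Send every request to the parent cache
|     set req.backend = parent;
| }
|
`----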

You will most likely see bottlenecks in other places with varnish
than with squid. You can, of course, put varnish in front of varnish,
or load balance between varnish instances if the impact of losing your
cache is too big, and you need to safeguard against this.

The cache_peer functionality of squid is a very nice feature if you
are using squid, but you may design your caching in another way with
varnish.

> 2. Redirect / Rewrite

> - Obviously running a redirector / rewrite application via a perl
> script isn't ideal for performance but has been proven to be an
> amazing resource when migrating CMS platforms or to work around
> "features" of a specific application platform.

As long as you don't execute a perl script for every request, you should
in theory have quite good performance with this method of rewriting.

With the released (up to 1.1.2) versions of varnish, it seems you have
to use an external url rewriter. While I have used apache, lighttpd
and nginx for this, there may be other, and better, alternatives
available.

> 3. Header Replace
> - By default Squid enforces cache policy based on headers served
> from the origin system. In some cases we need to then change those
> headers when returning data to the client browser. An example would be
> to modify the cache-control and expires headers to instruct the
> browser not to cache. For a given site we could be setting this as a
> global value, for specific URL patterns / directories, or for file
> extensions.
>
> Does Varnish support this currently? If not is it on the roadmap?

Yes, these examples from the vcl man page show how to manipulate
headers:

,----[ vcl(7) ]
|
| sub vcl_recv {
|     # Normalize the Host: header
|     if (req.http.host ~ "^(www.)?example.com$") {
|         set req.http.host = "www.example.com";
|     }
| }
| [...]
| sub vcl_fetch {
|     # Don't cache cookies
|     remove obj.http.Set-Cookie;
| }
|
`----

You can manipulate headers received from the client (req.http.*), as
well as the request headers sent to the backend (bereq.http.*), object
headers retrieved from cache or backend (obj.http.*) and response
headers (resp.http.*) sent to the client in the same way.
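
For the browser-caching case in the question, a minimal sketch in the
same style might look like the following. The /private/ path and the
header value are invented for illustration, and I am assuming that
changing obj.http.* in vcl_fetch alters the headers passed on to the
client without changing how long Varnish itself keeps the object;
verify that against your version:

,----[ sketch, not from vcl(7) ]
|
| sub vcl_fetch {
|     # Hypothetical URL pattern: tell browsers not to cache these
|     if (req.url ~ "^/private/") {
|         set obj.http.Cache-Control = "no-cache, no-store, must-revalidate";
|         remove obj.http.Expires;
|     }
| }
|
`----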

--
Stig Sandbeck Mathisen, Linpro
Migration from Squid [ In reply to ]
Thanks for the response... is "cache clustering" on the roadmap for
Varnish? The sibling cache_peer configuration has been an excellent
performance boost for us, but my primary concern is resource
utilization. If the cache server has fewer resources than the origin
server on a busy site, then object expiration on the cache will
increase and, as a result, so will hits against the origin. Placing
another cache in between the external cache and the origin is an
interesting idea but adds too much complexity for my taste.

-Max
