Mailing List Archive

Regarding url purging
Hello.

I'm caching pages from different vhosts, say x.com and y.com.

If I want to purge all the data from one of the vhosts how do I do
it? "url.purge x.com.*" does not work. I'm a bit confused with the
syntax since the wiki only talks about paths and not complete URLs.

Thanks for the help,
André
Regarding url purging [ In reply to ]
On Thursday 05 July 2007 19:39, André Cruz wrote:
> Hello.
>
> I'm caching pages from different vhosts, say x.com and y.com.
>
> If I want to purge all the data from one of the vhosts how do I do
> it? "url.purge x.com.*" does not work. I'm a bit confused with the
> syntax since the wiki only talks about paths and not complete URLs.

That's not easily done, as far as I have been able to determine.
I've submitted a feature request for it.

As I understand it, you can only use the hostname as part of the URL when you
purge by HTTP, but then you get no wildcards.
You can use full regexps when purging on the console, but then only the path
is available to match against.

I guess a workaround would be to parse the log to make a list of all URLs
served within the expiry limit, do your filtering on that list, and then purge
all the matching URLs one by one by HTTP.

The NCSA log contains the full URL including host, which I think is somewhat
unusual, so grepping for that should not be too hard.

Here is the pipe I use to get host as a separate column, that should be a
start.

/usr/bin/varnishncsa -c -r /var/log/varnish/varnish.log | \
sed -e 's#\("\([A-Z]\+\) \(http://\([^/]*\)\)\(.*\)\)#\4 "\2 \5#g'
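
The filtering step of the workaround could also be done in a small script
instead of sed. What follows is only a rough sketch in Python, assuming log
lines whose request field carries a full http:// URL as described above. The
function name urls_for_host and the sample lines are made up for illustration,
the sketch does not filter by time (the "expiry limit" part is left out), and
the actual purge-by-HTTP step is not shown because it depends on your setup.

```python
import re

# Matches the quoted request field of an NCSA-style log line when the
# request target is a full URL, e.g. "GET http://x.com/a/b HTTP/1.1".
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>http://(?P<host>[^/ ]+)(?P<path>/[^ "]*)?)'
)


def urls_for_host(log_lines, host):
    """Return the distinct URLs requested for `host`, in first-seen order.

    Hypothetical helper: the host comparison is exact (case-insensitive),
    so "x.com:80" is not treated as "x.com".
    """
    seen = {}
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if m and m.group("host").lower() == host.lower():
            seen.setdefault(m.group("url"), None)
    return list(seen)


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the input.
    sample = [
        '10.0.0.1 - - [05/Jul/2007:19:39:00 +0200] "GET http://x.com/ HTTP/1.1" 200 512 "-" "-"',
        '10.0.0.2 - - [05/Jul/2007:19:39:01 +0200] "GET http://y.com/a HTTP/1.1" 200 100 "-" "-"',
        '10.0.0.1 - - [05/Jul/2007:19:39:02 +0200] "GET http://x.com/a?b=1 HTTP/1.1" 200 77 "-" "-"',
    ]
    for url in urls_for_host(sample, "x.com"):
        print(url)
```

The resulting list is what you would then walk through, purging each URL one
by one by HTTP.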


Gaute
Regarding url purging [ In reply to ]
On 2007/07/05, at 19:13, Gaute Amundsen wrote:

> That's not easily done, as far as I have been able to determine.
> I've submitted a feature request for it..
>

Where is it so that I can vote for it? :) Do you know if it's
targeted for 1.1?

> As I understand it, you can only use hostname as part of the url
> when you
> purge by http, but then no wildcards.
> You can use full regexps when purging on the console, but then only
> the path
> is available to match against.
>

Eeck.

> I guess a workaround would be parse the log to make a list of all
> urls served
> within the expiry limit, do your filtering on that list, and then
> purge all
> the matching urls one by one by http.
>

Thanks for your help but I think I'll wait either for regexp on http
purging or full urls on console purging, which seem like basic
functions to have anyway.

Regards,
André Cruz
Regarding url purging [ In reply to ]
On Thursday 05 July 2007 20:30, André Cruz wrote:
> On 2007/07/05, at 19:13, Gaute Amundsen wrote:
> > That's not easily done, as far as I have been able to determine.
> > I've submitted a feature request for it..
>
> Where is it so that I can vote for it? :) Do you know if it's
> targeted for 1.1?

http://varnish.projects.linpro.no/ticket/116
1.1? No idea. Don't think there is any voting system either..
Except perhaps bribing DES with a Mac ;-)

> > As I understand it, you can only use hostname as part of the url
> > when you
> > purge by http, but then no wildcards.
> > You can use full regexps when purging on the console, but then only
> > the path
> > is available to match against.
>
> Eeck.
>
> > I guess a workaround would be parse the log to make a list of all
> > urls served
> > within the expiry limit, do your filtering on that list, and then
> > purge all
> > the matching urls one by one by http.
>
> Thanks for your help but I think I'll wait either for regexp on http
> purging or full urls on console purging, which seem like basic
> functions to have anyway.

Hm, well, Varnish is still a pretty young project, you know :)
I think the majority of users so far are "one site shops", or can reliably
predict what will have to be purged when something is updated.

Myself I'd rather have better url rewriting to be honest,
http://varnish.projects.linpro.no/ticket/77
then I could phase out apache from our current varnish->apache->zope setup.

Gaute
Regarding url purging [ In reply to ]
In message <200707052013.24760.gaute at pht.no>, Gaute Amundsen writes:
>On Thursday 05 July 2007 19:39, André Cruz wrote:

>> I'm caching pages from different vhosts, say x.com and y.com.
>>
>> If I want to purge all the data from one of the vhosts how do I do
>> it? "url.purge x.com.*" does not work. I'm a bit confused with the
>> syntax since the wiki only talks about paths and not complete URLs.
>
>That's not easily done, as far as I have been able to determine.
>I've submitted a feature request for it..

And I've been thinking about it and may have come up with a solution:

When we hash, we hash on a string that by default contains

req.http.host "#" req.url "#"

In your case, for instance, it would be:

x.com#/#

It looks like purges need to (be able to) match against this string,
rather than the url alone.

I'm not sure what the exact form of the solution will be, but I
may simply add another CLI command:

hash.purge $regexp

That matches against the hash string.

The caveat of this is, that if the user customizes the hash string
with vcl_hash(), he has to take this into account with his purge
strings.

But I can see how that could be an advantage also: one could "tag"
objects in vcl_hash with a class or token, which can later be used
to purge the tagged objects.
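
To make the idea concrete, here is a rough sketch in Python of how a regexp
would select objects if it were matched against the hash string. It is not
Varnish code and not a definitive description of how hash.purge would behave:
the helper names hash_string and would_purge are made up, the regexp is
assumed to be applied as an unanchored search, and the "tag:static" token and
its position at the end of the string are assumptions about one possible
vcl_hash customization.

```python
import re


def hash_string(host, url, extra=()):
    """Build a string shaped like the default hash input: host "#" url "#".

    `extra` models hypothetical additions from a customized vcl_hash;
    each extra component is appended and followed by "#".
    """
    return "".join(part + "#" for part in (host, url, *extra))


def would_purge(pattern, cached):
    """Return the hash strings that `pattern` matches (search semantics)."""
    rx = re.compile(pattern)
    return [h for h in cached if rx.search(h)]


cached = [
    hash_string("x.com", "/"),
    hash_string("x.com", "/news/1"),
    hash_string("y.com", "/"),
    hash_string("y.com", "/img/logo.png", extra=("tag:static",)),
]

# Everything cached for the x.com vhost.
print(would_purge(r"^x\.com#", cached))

# Everything carrying the (hypothetical) tag, regardless of vhost.
print(would_purge(r"#tag:static#$", cached))
```

The first pattern covers the original question (purge one whole vhost), the
second the tagging idea.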

Do we have a ticket for this in trac already ?

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Regarding url purging [ In reply to ]
On Friday 06 July 2007 12:21, Poul-Henning Kamp wrote:
> In message <200707052013.24760.gaute at pht.no>, Gaute Amundsen writes:
> >On Thursday 05 July 2007 19:39, André Cruz wrote:
> >> I'm caching pages from different vhosts, say x.com and y.com.
> >>
> >> If I want to purge all the data from one of the vhosts how do I do
> >> it? "url.purge x.com.*" does not work. I'm a bit confused with the
> >> syntax since the wiki only talks about paths and not complete URLs.
> >
> >That's not easily done, as far as I have been able to determine.
> >I've submitted a feature request for it..
>
> And I've been thinking about it and may have come up with a solution:
>
> When we hash, we hash on a string that by default contains
>
> req.http.host "#" req.url "#"
>
> In your case, for instance, it would be:
>
> x.com#/#
>
> It looks like purges need to (be able to) match against this string,
> rather than the url alone.
>
> I'm not sure what the exact form of the solution will be, but I
> may simply add another CLI command:
>
> hash.purge $regexp
>
> That matches against the hash string.
>
> The caveat of this is, that if the user customizes the hash string
> with vcl_hash(), he has to take this into account with his purge
> strings.
>
> But I can see how that could be an advantage also: one could "tag"
> objects in vcl_hash with a class or token, which can later be used
> to purge the tagged objects.
>
> Do we have a ticket for this in trac already ?

That looks like a perfectly fine way to do it as far as my needs go.

Being able to saw off the branch you're sitting on is not an unreasonable
possibility to expect if you want to play with power tools :)

Anyway, I'd expect it would be possible to craft a regexp anchored on the #'s
so that any additions to the hash do not change the match, if that's what you
wanted.

Ticket?
This one I guess: http://varnish.projects.linpro.no/ticket/116

Thanks :)

Gaute

--
Programmerer - Pixelhospitalet AS
Tørkoppveien 10, 1570 Dilling
Tlf. 24 12 97 81 - 9074 7344