Mailing List Archive

Use with Amazon CloudFront CDN (for images, PDFs, etc.)
Hi,

I work for a company (rfxtechnologies.com) that uses Bricolage to
manage various clients' web sites, at least one of which is quite
media-heavy. We want to use Amazon CloudFront as our CDN for almost
everything except HTML pages: images, PDFs, stylesheets, etc. We use
Amazon S3 buckets for storage, and I've done some of the work to
integrate the S3 mover into Bricolage for version 2.1.0.

Our problem is that when publishing a new version of a file, old
versions of that file are still cached at the CDN endpoints for quite
some time. Potential solutions:

(1) Set a short cache expiration period. Unfortunately Amazon only
lets us set it as short as one hour. Not good when a change must be
published ASAP.

(2) Use something called an invalidation request
<http://bit.ly/pSu8SH>. But you can only have 3 requests at a time
(each with up to 1,000 files) and each takes 15-20 minutes. Seems to
require us to build a second queueing system or graft invalidation
requests into our existing one. Again, not good for when a change
must be published ASAP.
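
To give a rough idea of the extra queueing option (2) implies, here is a minimal Python sketch. It assumes the limits quoted above (3 requests in flight, 1,000 files each, about 20 minutes per request) and only plans the batches; it makes no calls to Amazon. The function names are hypothetical.

```python
# Limits as quoted in the message above; treat them as assumptions,
# since Amazon may change them.
MAX_PATHS_PER_REQUEST = 1000
MAX_CONCURRENT_REQUESTS = 3


def plan_invalidations(paths, per_request=MAX_PATHS_PER_REQUEST,
                       concurrent=MAX_CONCURRENT_REQUESTS):
    """Split paths into request-sized batches, then group the batches
    into waves that could be in flight at the same time."""
    batches = [paths[i:i + per_request]
               for i in range(0, len(paths), per_request)]
    return [batches[i:i + concurrent]
            for i in range(0, len(batches), concurrent)]


def estimated_minutes(waves, minutes_per_wave=20):
    """Worst-case wait if every wave takes the quoted 20 minutes."""
    return len(waves) * minutes_per_wave
```

Under these assumptions, republishing 3,500 files needs four requests in two waves, so the last files could stay stale for roughly 40 minutes.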

(3) So now I'm looking to use some sort of versioning system on the
files we publish for distribution via CDN. (Remember, we're not using
the CDN for the HTML pages, just everything else.)

My questions are twofold:

(A) Can versioning be accomplished by simply adding support for
something like %{version} to output channel URL formats, or is it more
complex than that?

(B) Does anyone have any other thoughts on this whole thing?

Thanks in advance.
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
Hi Darren,

> (3) So now I'm looking to use some sort of versioning system on the
> files we publish for distribution via CDN. (Remember, we're not using
> the CDN for the HTML pages, just everything else.)

Definitely the way to go. Set one-year cache headers, and then when
content changes (new css, js or media), bump the version number in the
url.
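
A minimal Python sketch of that idea, assuming the version is carried as a path segment (it could equally go in the file name). The host cdn.example.com and both function names are hypothetical.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def versioned_url(base_url, version, path):
    """Build a URL that changes whenever the version changes, so the
    CDN sees each version as a brand-new object."""
    return f"{base_url.rstrip('/')}/v{version}/{path.lstrip('/')}"


def far_future_headers(max_age=ONE_YEAR_SECONDS):
    """Headers to store with the file so caches keep it for a year."""
    return {"Cache-Control": f"public, max-age={max_age}"}
```

Because the old URL is never reused, the long cache lifetime is harmless: pages that reference the new version simply point at a different URL.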

Cheers,

Alex
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
I wrote:
>> (3) So now I'm looking to use some sort of versioning system on the
>> files we publish for distribution via CDN.  (Remember, we're not using
>> the CDN for the HTML pages, just everything else.)

On Fri, Nov 18, 2011 at 12:45, Alex Krohn <alex@gt.net> wrote:
> Definitely the way to go. Set one-year cache headers, and then when
> content changes (new css, js or media), bump the version number in the
> url.

Good to know. Any thoughts on whether the following is the correct
way to go about it?

I wrote:
> (A) Can versioning be accomplished by simply adding support for
> something like %{version} to output channel URL formats, or is it more
> complex than that?
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
On Nov 18, 2011, at 12:17 PM, Darren Embry wrote:

> Our problem is that when publishing a new version of a file, old
> versions of that file are still cached at the CDN endpoints for quite
> some time. Potential solutions:

This is why, for DesignScene, I’m just using S3 and not CloudFront.

> (A) Can versioning be accomplished by simply adding support for
> something like %{version} to output channel URL formats, or is it more
> complex than that?

IIRC, it will work with any key listed in my_meths in the asset class or any of its parents. And yes, it looks like version is an option in Bric::Biz::Asset:

https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/Biz/Asset.pm#L548

There might be other issues, though; I'm not sure. I don't think the URI will be recalculated in the database for every new version, so finding existing media in the system by their URIs might be tricky.

So, create a test output channel for yourself and TIAS.

HTH,

David
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
Hello,

The company I work with, ChargeSmart (who is also hiring Perl developers, and
already has several CPAN authors including myself and Jon Swartz), also uses
Amazon CloudFront.

We currently get around the old versions problem by putting a version number in
the urls of each static media element, and incrementing it when that element
changes. In practice, though, few media elements change in place; most get
replaced, so they have different file names and need no version number increment.

The key thing is that each distinct media item has a distinct url, however it is
accomplished, and then CloudFront will do the right thing.
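
One other way to get a distinct url per distinct item, not described in this thread, is to derive part of the file name from the file's content. A minimal Python sketch, assuming a truncated SHA-256 digest is unique enough for the purpose; the function name is hypothetical.

```python
import hashlib
import posixpath


def fingerprinted_name(file_name, content, digest_chars=8):
    """Return file_name with a short content digest before the extension.

    Identical bytes always give the same name; changed bytes give a
    different name, so no separate version counter is needed.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    stem, ext = posixpath.splitext(file_name)
    return f"{stem}.{digest}{ext}"
```

The trade-off is that the name is only known once the file's bytes are available, whereas a version number can be assigned up front.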

-- Darren Duncan

Darren Embry wrote:
> I work for a company (rfxtechnologies.com) that uses Bricolage to
> manage various clients' web sites, at least one of which is quite
> media-heavy. We want to use Amazon CloudFront as our CDN for almost
> everything except HTML pages: images, PDFs, stylesheets, etc. We use
> Amazon S3 buckets for storage, and I've done some of the work to
> integrate the S3 mover into Bricolage for version 2.1.0.
>
> Our problem is that when publishing a new version of a file, old
> versions of that file are still cached at the CDN endpoints for quite
> some time. Potential solutions:
<snip>
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
>> (A) Can versioning be accomplished by simply adding support for
>> something like %{version} to output channel URL formats, or is it more
>> complex than that?
>
> IIRC, it will work with any key listed in my_meths in the asset
> class or any of its parents.
> And yes, it looks like version is an option in Bric::Biz::Asset:
>
>  https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/Biz/Asset.pm#L548
>
> There might be other issues, though; I'm not sure. I don't
> think the URI will be recalculated in the database for every
> new version, so finding existing media in the system by
> their URIs might be tricky.

I've done some deeper digging and it turns out the filename can't even
be part of the URI format for media.
If you stick %{slug} in there, it gets ignored.
The filename is appended outside of the _construct_uri function that
does all the %{...} replacements.
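
To illustrate the behavior described here, a minimal Python model follows. It is not Bricolage's code (which is Perl): the function names are hypothetical, and dropping unknown keys such as %{slug} is an assumption based on the observation above.

```python
import re

TOKEN = re.compile(r"%\{(\w+)\}")


def expand_format(uri_format, values):
    """Replace each %{key} with values[key]; keys with no value are
    dropped (assumed, to mirror %{slug} being ignored for media)."""
    return TOKEN.sub(lambda m: str(values.get(m.group(1), "")), uri_format)


def media_uri(uri_format, values, file_name):
    """Expand the directory part first, then append the file name as-is."""
    directory = expand_format(uri_format, values)
    # Collapse doubled slashes left behind by dropped tokens.
    directory = re.sub(r"/{2,}", "/", directory).rstrip("/")
    return f"{directory}/{file_name}"
```

In this model a %{version} token can only ever change the directory, never the file name, because the file name is joined on after the substitution step.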

Any insight as to why this is the case? Looks like I'll have to do
quite a bit of hacking and I don't want to break anything.
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
On Dec 2, 2011, at 9:02 AM, Darren Embry wrote:

> I've done some deeper digging and it turns out the filename can't even
> be part of the URI format for media.
> If you stick %{slug} in there, it gets ignored.

Yeah, media have no slug.

> The filename is appended outside of the _construct_uri function that
> does all the %{...} replacements.
>
> Any insight as to why this is the case? Looks like I'll have to do
> quite a bit of hacking and I don't want to break anything.

Well, it's a file name.

What if you use %{publish_date}? That should change every time it's published (as opposed to %{first_publish_date} and %{cover_date}).

Best,

David
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
On Fri, Dec 2, 2011 at 12:05, David E. Wheeler <david@justatheory.com> wrote:
>> The filename is appended outside of the _construct_uri function that
>> does all the %{...} replacements.
>>
>> Any insight as to why this is the case?  Looks like I'll have to do
>> quite a bit of hacking and I don't want to break anything.
>
> Well, it's a file name.
>
> What if you use %{publish_date}? That should change every time it's
> published (as opposed to %{first_publish_date} and %{cover_date}).

Not terribly fond of creating a new directory every time I publish something.
What I had in mind was something like this:
/images/%{categories}/filename-%{version}.jpg
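
A minimal Python sketch of the URI shape proposed above, assuming the version goes between the file name's stem and its extension. This only illustrates the desired output, not how Bricolage would produce it; the function names and the "/images" default are hypothetical.

```python
import posixpath


def versioned_file_name(file_name, version):
    """Insert the version before the extension: logo.jpg -> logo-4.jpg."""
    stem, ext = posixpath.splitext(file_name)
    return f"{stem}-{version}{ext}"


def versioned_media_uri(categories, file_name, version, prefix="/images"):
    """Join prefix, category path and versioned file name into one URI."""
    return posixpath.join(prefix, categories.strip("/"),
                          versioned_file_name(file_name, version))
```

With this shape each publish produces a new file in the same directory, rather than a new directory per publish.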
Re: Use with Amazon CloudFront CDN (for images, PDFs, etc.)
On Dec 2, 2011, at 12:26 PM, Darren Embry wrote:

> Not terribly fond of creating a new directory every time I publish something.
> What I had in mind was something like this:
> /images/%{categories}/filename-%{version}.jpg

Oh, I see, you can’t modify the file name at all? I’d be interested in a patch that allowed that.

David