Mailing List Archive

Caching thoughts
I probably won't get to any of this for a while as I've promised to fix
up that conversion script, but some thoughts...

Client-side caching of page views and Recentchanges is allowed
(currently only for IE 5.5+, as Mozilla and IE 5.0 have had difficulties
refreshing and I haven't had a chance to test with other browsers).

The reported last-modified date is the edit date of the article revision
(or for Recentchanges, the last modified article). On load, the browser
sends this back to us in an if-modified-since header; if it's identical,
we tell it to use its cached version, else we send out the new time and
continue (see OutputPage::checkLastModified() ). We force this check on
every attempt to load the page by pre-expiring it and sending a
'must-revalidate' cache-control header (see OutputPage::output() ); this
requires a round-trip to the server, but we avoid most of the heavy
lifting by doing the time check quickly and getting out before querying
a hundred links and parsing whole pages of gunk.
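
To make the flow above concrete, here is a rough sketch in Python rather
than the wiki's own code. It is not what OutputPage::checkLastModified()
actually contains; the function name, the plain dict of request headers,
the max-age=0 value, and the handling of parameters after a semicolon are
all assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def check_last_modified(request_headers, last_modified):
    """Return (status, headers). A 304 means: stop here, skip rendering."""
    # HTTP dates are always GMT; last_modified must be timezone-aware UTC.
    stamp = format_datetime(last_modified, usegmt=True)
    headers = {
        "Last-Modified": stamp,
        # Pre-expire the page and demand revalidation so the browser asks
        # us on every load instead of silently reusing its copy.
        "Expires": "Thu, 01 Jan 1970 00:00:00 GMT",
        "Cache-Control": "must-revalidate, max-age=0",  # max-age is assumed
    }
    # Some browsers append extra parameters after a semicolon; drop them.
    sent = request_headers.get("If-Modified-Since", "").split(";")[0].strip()
    if sent == stamp:
        return 304, headers  # identical: browser may use its cached copy
    return 200, headers      # caller goes on to render the full page
```

The point of the design is that the comparison needs only one timestamp,
so the 304 path can return before any link lookups or parsing happen.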

(Recentchanges changes so often there's probably no point to caching it
though...)

Some problems: this does not take into account secondary changes: linked
articles are created and deleted, users log in and out or change their
display preferences. This can lead to incorrect display, which isn't
always cleared by a refresh -- by default, refresh in IE sends the
if-modified-since header and obeys the 304 Not Modified. Users don't
generally know about the Ctrl+Refresh force reload trick.

Links might be solvable by adding a "last_touched" timestamp field to
cur; this would be set to edit timestamp on save, and whenever a page is
created or deleted, all pages linking to it would have their
last_touched reset to the current time. We then use this as our basis
for cache comparisons, forcing reloads of pages whose links should show
differently. (And if changes to the software are made that change
rendering, we can invalidate caches for the whole wiki with a sweeping
UPDATE statement.)
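
A minimal sketch of the last_touched idea, using Python with an in-memory
SQLite database so it can be run standalone. Only "cur" and "last_touched"
come from the proposal above; the "links" table, the column names, the
string timestamps, and the function names are hypothetical stand-ins.

```python
import sqlite3


def make_db():
    """Toy schema: 'cur' holds pages, 'links' holds (from, to) title pairs."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cur (title TEXT PRIMARY KEY,"
               " edited TEXT, last_touched TEXT)")
    db.execute("CREATE TABLE links (from_title TEXT, to_title TEXT)")
    return db


def save_page(db, title, now):
    """On save, last_touched is set to the edit timestamp."""
    db.execute("INSERT OR REPLACE INTO cur VALUES (?, ?, ?)",
               (title, now, now))


def touch_linking_pages(db, target, now):
    """Call when 'target' is created or deleted: bump pages linking to it."""
    db.execute("UPDATE cur SET last_touched = ? WHERE title IN"
               " (SELECT from_title FROM links WHERE to_title = ?)",
               (now, target))


def invalidate_everything(db, now):
    """The sweeping UPDATE for software changes that alter rendering."""
    db.execute("UPDATE cur SET last_touched = ?", (now,))
```

The cache check would then compare against last_touched instead of the
edit timestamp; pages that do not link to the changed article keep their
old value and stay cached.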

Things like user settings and logging in and out might be solvable with
session variables; an invalidating event would set a sort of 'cache from
here on' time, and any earlier last mod would get pushed up to there.
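
Roughly, the session idea might look like the following Python sketch.
The session is modelled as a plain dict, and the "cache_epoch" key and
both function names are made up for illustration.

```python
from datetime import datetime


def mark_session_changed(session: dict, now: datetime) -> None:
    """Record an invalidating event (login, logout, preference change)."""
    session["cache_epoch"] = now


def effective_last_modified(page_touched: datetime, session: dict) -> datetime:
    """Push any earlier last-modified time up to the session's epoch."""
    epoch = session.get("cache_epoch")
    if epoch is not None and epoch > page_touched:
        return epoch
    return page_touched
```

The result is what would be reported as last-modified and compared with
if-modified-since, so a copy cached before the event no longer matches.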

Other problems: I added the caching support in the first place because
with the pages marked as uncacheable IE was forcing reloads on every
pass through the 'back' and 'forwards' buttons. Usually this is simply
an inconvenience -- more time spent waiting for the server, and more
load on the server increasing the time spent waiting. When editing,
however, this can cause data loss; some people will click the links from
the sidebar to get additional information, then hit 'back' to return to
their editing session. Oops! The page was reloaded. Is it safe to mark
the edit pages as cacheable? What do we have to do to them to mark them
properly?

-- brion vibber (brion @ pobox.com)