Changes to maximum allowed time for RecentChanges and several other special pages
Hello,
As of today, all requests to several special pages (and their API
counterparts) are subject to a maximum database query execution time of 30
seconds. These special pages are: RecentChanges, Watchlist, Contributions,
and Log. This limit has already been in place for one third of all requests
accessing these page types since December 16th.

This threshold is based on a sample of half a million requests made to
these special pages by users (excluding crawlers) on the three largest wikis
(English Wikipedia, Wikidata and Wikimedia Commons). Out of 500,000
requests, only 64 took more than thirty seconds, and 38 of those 64 took
more than 60 seconds. For those 38, the database did return the result of
the query but, due to the 60-second limit of the webservice workers,
no results were shown to the user. Our logs show around 1,000
requests per day taking more than thirty seconds (in database query time),
so the ratio is extremely small (we get around 10 billion requests per day).
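
To make the proportions above easier to check, here is a minimal sketch of
the arithmetic in Python. It uses only the figures quoted in this email; the
variable and function names are made up for illustration, and the
"between 30 and 60 seconds" count is simply the difference of the two
reported numbers.

```python
# Figures quoted in the email above.
SAMPLED_REQUESTS = 500_000
OVER_30_SECONDS = 64
OVER_60_SECONDS = 38          # subset of the 64
SLOW_PER_DAY = 1_000          # approximate, from the logs
ALL_REQUESTS_PER_DAY = 10_000_000_000  # approximate, all request types


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


share_over_30s = percent(OVER_30_SECONDS, SAMPLED_REQUESTS)
share_over_60s = percent(OVER_60_SECONDS, SAMPLED_REQUESTS)
# Requests slower than 30 s but not slower than 60 s in the sample.
between_30_and_60 = OVER_30_SECONDS - OVER_60_SECONDS
daily_share = percent(SLOW_PER_DAY, ALL_REQUESTS_PER_DAY)

print(f"over 30 s in sample: {share_over_30s:.4f}%")
print(f"over 60 s in sample: {share_over_60s:.4f}%")
print(f"between 30 s and 60 s: {between_30_and_60} requests")
print(f"slow requests as share of all daily traffic: {daily_share:.5f}%")
```

In other words, roughly one sampled request in eight thousand crossed the
30-second mark, and more than half of those were already failing because of
the 60-second worker limit.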

Putting things in context, these queries, while being only 0.01% of the
total number of requests to those special pages, were responsible for 2.9%
of the load. In a few scenarios where such queries appeared at a higher
volume, they were a contributing factor in outages that made one or more
Wikimedia projects unavailable. If left as-is, the current configuration is
a potential DDoS vector (and has likely been weaponized as one).

Even though ~1000 requests a day is not much, I understand that it is an
inconvenience and it might break useful workflows for some people (for
example patrollers). A query timeout doesn’t mean your request is bad; it
means our database schemas need improvements. We are aware of some of these
needed improvements and we are working on them, but migrating terabytes of
data that need to be replicated to twenty different servers while serving
millions of queries per second is not easy and will take time.

In the meantime, if this is breaking your workflow, please use
https://quarry.wmcloud.org or build a tool in Toolforge instead (it has
access to database replicas with a much higher timeout). For more
information, see:
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database
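
As a rough illustration of what such a tool could look like, here is a
minimal Python sketch that reads the latest entries from the recentchanges
table on a replica. Several things here are assumptions on my part rather
than something stated in this email: the hostname pattern, the "_p" database
suffix, the ~/replica.my.cnf credentials file, and the use of the third-party
pymysql package. Please check them against the help page linked above before
relying on them. The function names are hypothetical.

```python
import os


def replica_target(wiki, cluster="analytics"):
    """Build (host, database) for a wiki replica.

    Assumption: the naming pattern follows the Toolforge help page;
    verify it there before use.
    """
    host = f"{wiki}.{cluster}.db.svc.wikimedia.cloud"
    database = f"{wiki}_p"
    return host, database


# Keep the result set small: an ORDER BY on the timestamp plus a LIMIT.
RECENT_CHANGES_SQL = (
    "SELECT rc_timestamp, rc_namespace, rc_title "
    "FROM recentchanges "
    "ORDER BY rc_timestamp DESC "
    "LIMIT %s"
)


def fetch_recent_changes(wiki="enwiki", limit=10):
    """Return the most recent rows; only works from inside Toolforge."""
    import pymysql  # third-party; imported lazily so the helpers above work without it

    host, database = replica_target(wiki)
    connection = pymysql.connect(
        host=host,
        database=database,
        # Assumed location of the tool's replica credentials.
        read_default_file=os.path.expanduser("~/replica.my.cnf"),
    )
    try:
        with connection.cursor() as cursor:
            cursor.execute(RECENT_CHANGES_SQL, (limit,))
            return cursor.fetchall()
    finally:
        connection.close()
```

Calling fetch_recent_changes("enwiki", 10) from a Toolforge account should
then return up to ten rows, assuming the connection details above are right.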

If you need advice on writing the queries, feel free to create a ticket
with the “Data-Persistence (Consultation)” tag and we would be happy to
help.

For similar reasons the DynamicPageList extension has had a query limit of
10 seconds (due to its history of causing major issues
<https://phabricator.wikimedia.org/T287380>) since December 16.
Unfortunately, this doesn’t mean it’s safe to enable DPL on more wikis.

For more information, you can take a look at the ticket:
https://phabricator.wikimedia.org/T297708

Thank you
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>