[Breaking Change] Two changes regarding searching for proto-relative URLs in external links
Hello,

As part of work to make storage of external links in MediaWiki continue to
scale without risking site stability (T312666
<https://phabricator.wikimedia.org/T312666>), we are deprecating most of
the special functionalities around proto-relative URLs (URLs that start
with // instead of https:// or http://).

Proto-relative URLs were beneficial a decade ago, when Wikimedia projects
were being served in both encrypted and unencrypted traffic (http and
https). However, since 2015, all of our traffic has been served encrypted
only, and this functionality doesn’t provide much user benefit any more for
Wikimedia wikis. With HTTP/2, a similar circumstance applies to all
MediaWiki users.

As well as this functionality being low-value, our external links storage
(the externallinks table) has grown to be one of the biggest tables for each
production wiki.
This is due to many duplications of URL information, added to serve
different use cases. With the changes, we are removing these duplications,
and some of the functionality. You can read more about the work in T312666
<https://phabricator.wikimedia.org/T312666>.

Storage of proto-relative URLs has changed to only store HTTPS URLs

Previously, if a proto-relative URL was added in an edit, MediaWiki internally
treated it as two links, one with http:// and one with https://. From this
week forward, for all Wikimedia wikis, the storage will change to store
only https:// URLs. Once those wikis are switched to read the new database
schema, the links will be presented as https only in Special:LinkSearch and
its API counterparts. This effectively means a proto-relative external
link will be treated like an HTTPS one. This change will also apply to
non-Wikimedia wikis using MediaWiki 1.41+.
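
The difference between the old and new handling can be sketched as follows.
This is only an illustration of the behaviour described above, not
MediaWiki's actual code; the function names are hypothetical.

```python
def links_before_change(url: str) -> list[str]:
    """Old behaviour as described above: a proto-relative URL counts as two links."""
    if url.startswith("//"):
        return ["http:" + url, "https:" + url]
    return [url]


def links_after_change(url: str) -> list[str]:
    """New behaviour as described above: a proto-relative URL is treated as HTTPS only."""
    if url.startswith("//"):
        return ["https:" + url]
    return [url]


if __name__ == "__main__":
    print(links_before_change("//example.org/page"))
    print(links_after_change("//example.org/page"))
```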

expandurl option is deprecated and ignored in the exturlusage and extlinks
MediaWiki action API modules

This means the “expandurl” argument in the exturlusage and extlinks
<https://www.mediawiki.org/wiki/API:Extlinks> API modules will be ignored,
and proto-relative URLs will always be expanded to HTTPS. This will happen
any time a wiki is switched to read from the new externallinks fields. (You
can track the progress in T335343
<https://phabricator.wikimedia.org/T335343>.) This change will also apply to
non-Wikimedia wikis using MediaWiki 1.41+.
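
For API clients, a minimal sketch of one way to prepare for this is shown
below. It assumes the option is sent as euexpandurl (exturlusage) and
elexpandurl (extlinks), which is the usual module-prefixed spelling; please
check the API documentation for your wiki. The helper names are
hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed request spellings of the "expandurl" option for the two modules.
DEPRECATED_PARAMS = ("euexpandurl", "elexpandurl")


def drop_expandurl(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of the request parameters without the ignored option."""
    return {k: v for k, v in params.items() if k not in DEPRECATED_PARAMS}


def expand_to_https(url: str) -> str:
    """Client-side fallback for wikis not yet switched: expand // to https://."""
    return "https:" + url if url.startswith("//") else url


if __name__ == "__main__":
    old_params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": "example.org",
        "euexpandurl": "1",
    }
    print(urlencode(drop_expandurl(old_params)))
    print(expand_to_https("//example.org/page"))
```

Normalising on the client side means the same code gives consistent results
both before and after a given wiki is switched.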

If your wiki heavily uses proto-relative URLs in articles' wikitext, we
recommend changing them to https instead, which also improves storage, as
every proto-relative URL takes up two rows.
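
As a rough sketch of such a clean-up, the snippet below rewrites bracketed
proto-relative external links in a wikitext string. It is an assumption-laden
illustration, not an official tool: the function name is hypothetical, and it
only handles links written as [//host/path label]. It does not account for
templates, nowiki sections, or URLs in other positions, so review any
changes before saving them.

```python
import re

# A single "[" (not part of "[[") followed by "//" and then a host character.
PROTO_RELATIVE_LINK = re.compile(r"(?<!\[)\[//(?=[^\s\]/])")


def proto_relative_to_https(wikitext: str) -> str:
    """Rewrite [//example.org ...] external links to [https://example.org ...]."""
    return PROTO_RELATIVE_LINK.sub("[https://", wikitext)


if __name__ == "__main__":
    sample = "See [//example.org/page Example] and [[Main Page]]."
    print(proto_relative_to_https(sample))
```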

Thank you,

--

Amir Sarabadani, Staff Database Architect

James Forrester, Staff Software Engineer

Timo Tijhof, Principal Performance Engineer