New query builders and IConnectionProvider
Hello,

As part of ongoing refactors and improvements to the Rdbms component in
MediaWiki, we have made some changes that might affect your work. Read
the February announcement
<https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/message/YNLVJVTYTK3IGQF4HY7ITGMIJ2W5Q7CG/>
for previous changes in this area.

New DeleteQueryBuilder and UnionQueryBuilder

We introduced the chainable SelectQueryBuilder
<https://www.mediawiki.org/wiki/Manual:Database_access#SelectQueryBuilder>
in 2020 to create SELECT queries. This "fluent" API style allows each
segment of the query to be set through a discoverable method name with rich
documentation and parameter types. Last February we introduced
UpdateQueryBuilder in the same vein.

The new DeleteQueryBuilder similarly supersedes use of IDatabase::delete()
for building DELETE operations. Check change 913646
<https://gerrit.wikimedia.org/r/c/mediawiki/core/+/913646/1/includes/user/UserOptionsManager.php#417>
for how this is used in practice.
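
As a rough sketch of the pattern (not the code from the change above), a DELETE built this way might look like the following. It assumes a MediaWiki version in which IDatabase::newDeleteQueryBuilder() is available. The function name, $userId, and $keys are made up for illustration; user_properties, up_user, and up_property are the existing user options table and columns.

```php
<?php
use Wikimedia\Rdbms\IDatabase;

/**
 * Hypothetical helper: delete the given option keys for one user.
 *
 * @param IDatabase $dbw Primary (writable) connection
 * @param int $userId
 * @param string[] $keys Option names to remove
 */
function deleteUserOptionKeys( IDatabase $dbw, int $userId, array $keys ): void {
	if ( !$keys ) {
		// Nothing to delete; also avoids passing an empty IN (...) list.
		return;
	}

	$dbw->newDeleteQueryBuilder()
		->deleteFrom( 'user_properties' )
		->where( [
			'up_user' => $userId,
			// An array value becomes an IN (...) condition
			'up_property' => $keys,
		] )
		->caller( __METHOD__ )
		->execute();
}
```

The older equivalent would pass the same table, conditions, and caller name as positional arguments to IDatabase::delete().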

Union queries are fairly rare in our codebases. Previously, one would have
to bypass the query builders (by calling the underlying SQL text formatters
like Database::selectSqlText) and pass several raw SQL strings through
Database::unionQueries and then to Database::query. As you can see, this is
not optimal. The new UnionQueryBuilder enables you to combine several
SelectQueryBuilder objects and then use the familiar fetchResultSet()
method on the resulting UnionQueryBuilder object. You can see an example of
this in change 906101
<https://gerrit.wikimedia.org/r/c/mediawiki/core/+/906101/5/maintenance/findOrphanedFiles.php>.
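
A minimal sketch of the idea, loosely modelled on the kind of query findOrphanedFiles.php runs but not copied from that change. It assumes a MediaWiki version in which newUnionQueryBuilder() is available on the connection object. The function name and the $names parameter are hypothetical; image, oldimage, img_name, and oi_archive_name are existing core tables and columns.

```php
<?php
use Wikimedia\Rdbms\IReadableDatabase;
use Wikimedia\Rdbms\IResultWrapper;

/**
 * Hypothetical helper: look up current and archived file rows in one query.
 *
 * @param IReadableDatabase $dbr Replica connection
 * @param string[] $names Non-empty list of file names
 * @return IResultWrapper Rows with "name" and "old" fields
 */
function findFileRows( IReadableDatabase $dbr, array $names ): IResultWrapper {
	// Each branch of the union is an ordinary SelectQueryBuilder.
	$current = $dbr->newSelectQueryBuilder()
		->select( [ 'name' => 'img_name', 'old' => '0' ] )
		->from( 'image' )
		->where( [ 'img_name' => $names ] );

	$archived = $dbr->newSelectQueryBuilder()
		->select( [ 'name' => 'oi_archive_name', 'old' => '1' ] )
		->from( 'oldimage' )
		->where( [ 'oi_archive_name' => $names ] );

	return $dbr->newUnionQueryBuilder()
		->add( $current )
		->add( $archived )
		// UNION ALL: keep duplicate rows instead of de-duplicating
		->all()
		->caller( __METHOD__ )
		->fetchResultSet();
}
```

No raw SQL strings are passed around here; the individual SELECTs stay as builder objects until the union is executed.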

IConnectionProvider typehint

In February, we introduced the LBFactory->getPrimaryDatabase() and
LBFactory->getReplicaDatabase() methods to improve ergonomics around
getting database connections, and the IReadableDatabase typehint for
replica connections.

We're now introducing the IConnectionProvider typehint as a stable
interface holding the above methods without the rest of LBFactory. This
means that if all you need is a connection, you can typehint to
IConnectionProvider. This is an extremely narrow interface (four public
methods!) compared to LBFactory (42 public methods). This will reduce
coupling and should ease testing as well. We already adopted
IConnectionProvider in several MediaWiki core components, which made a
notable dent in MediaWiki's PhpMetrics complexity score
<https://doc.wikimedia.org/mediawiki-core/master/phpmetrics/complexity.html>.

This backwards-compatible change can be adopted by changing only the
typehint. MediaWiki’s LBFactory service implements IConnectionProvider.
Dependency injection remains unchanged.

You can see an example in change 913649
<https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DiscussionTools/+/913649/2/includes/SubscriptionStore.php>.
We recommend changing variable names like $lbFactory to $dbProvider for
consistency in usage.
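
To illustrate, here is a hedged sketch of a class typehinted against the narrow interface. It is not the DiscussionTools code: the class name, the example_subscription table, and the es_user column are invented for illustration. getPrimaryDatabase() and getReplicaDatabase() are the connection methods as named on IConnectionProvider in MediaWiki core, and the sketch assumes a version where newDeleteQueryBuilder() is available.

```php
<?php
use Wikimedia\Rdbms\IConnectionProvider;

/**
 * Hypothetical store class. Compared with an LBFactory-based version,
 * only the constructor typehint and the property name change.
 */
class ExampleSubscriptionStore {
	private IConnectionProvider $dbProvider;

	public function __construct( IConnectionProvider $dbProvider ) {
		$this->dbProvider = $dbProvider;
	}

	public function countForUser( int $userId ): int {
		// Reads go to a replica connection
		$dbr = $this->dbProvider->getReplicaDatabase();

		return (int)$dbr->newSelectQueryBuilder()
			->select( 'COUNT(*)' )
			->from( 'example_subscription' )
			->where( [ 'es_user' => $userId ] )
			->caller( __METHOD__ )
			->fetchField();
	}

	public function removeForUser( int $userId ): void {
		// Writes go to the primary connection
		$dbw = $this->dbProvider->getPrimaryDatabase();

		$dbw->newDeleteQueryBuilder()
			->deleteFrom( 'example_subscription' )
			->where( [ 'es_user' => $userId ] )
			->caller( __METHOD__ )
			->execute();
	}
}
```

The object passed to the constructor can still be the LBFactory service (for example from MediaWikiServices::getDBLoadBalancerFactory()), so existing wiring keeps working. In unit tests, a mock of the interface is enough.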

Future changes

We will continue transforming MediaWiki core to adopt query builders,
IConnectionProvider, and IReadableDatabase. And, we'll continue helping
various core components and extensions to phase out their direct calls to
LoadBalancer::getConnection(). This should improve code health,
readability, testability, and overall coupling scores in MediaWiki.

Our next announcement will introduce InsertQueryBuilder, and we're
considering a query builder for "UPSERT" queries as well. Lastly, we are
planning to discourage remaining use of raw SQL in "where" conditions by
introducing an expression builder (T210206
<https://phabricator.wikimedia.org/T210206>).

You can help

Nearly every feature of MediaWiki requires retrieving or storing
information from a database at one point or another. Reworking these is a
long tail of effort. Any help in adopting query builders and migrating code
away from ILoadBalancer and ILBFactory would be much appreciated.
Together we can improve readability and code health (now with fewer
footguns!).

Until the next update,

--

Amir Sarabadani (he/him), Staff Database Architect

Timo Tijhof, Principal Performance Engineer