Mailing List Archive

Changes in schema of pagelinks tables
(If you don’t work with the pagelinks table, feel free to ignore this message)

Hello,

Here is an update and reminder on the previous announcement
<https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/message/U2U6TXIBABU3KDCVUOITIGI5OJ4COBSW/>
regarding normalization of links tables that was sent around a year ago.

As part of that work, the pl_namespace and pl_title columns of the
pagelinks table will soon be dropped, and you will need to use
pl_target_id, joined with the linktarget table, instead. This is basically
identical to the templatelinks normalization that happened a year ago.
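
A minimal sketch of what the query change looks like, using Python's
sqlite3 module as an in-memory stand-in for a wiki database (production
wikis run MariaDB). The column names lt_id, lt_namespace and lt_title are
the linktarget columns as I understand the MediaWiki schema; the sample
rows and the helper name pages_linking_to are made up for illustration.

```python
import sqlite3

# In-memory stand-in for a wiki database; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linktarget (
        lt_id        INTEGER PRIMARY KEY,
        lt_namespace INTEGER NOT NULL,
        lt_title     TEXT    NOT NULL
    );
    CREATE TABLE pagelinks (
        pl_from           INTEGER NOT NULL,
        pl_from_namespace INTEGER NOT NULL,
        pl_target_id      INTEGER NOT NULL
    );
    INSERT INTO linktarget VALUES (1, 0, 'Berlin'), (2, 0, 'Paris');
    INSERT INTO pagelinks VALUES (10, 0, 1), (11, 0, 1), (12, 0, 2);
""")

# Old form, which stops working once the columns are dropped:
#   SELECT pl_from FROM pagelinks
#   WHERE pl_namespace = 0 AND pl_title = 'Berlin';

# New form: resolve the target through linktarget.
NEW_QUERY = """
    SELECT pl_from
    FROM pagelinks
    JOIN linktarget ON pl_target_id = lt_id
    WHERE lt_namespace = ? AND lt_title = ?
    ORDER BY pl_from
"""

def pages_linking_to(conn, namespace, title):
    """Return the page IDs that link to (namespace, title)."""
    rows = conn.execute(NEW_QUERY, (namespace, title)).fetchall()
    return [row[0] for row in rows]

print(pages_linking_to(conn, 0, "Berlin"))  # [10, 11]
```

The only change a typical tool needs is the extra JOIN and filtering on
the lt_ columns instead of the pl_ ones.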

Currently, MediaWiki writes to both the old and the new pagelinks schema
for new rows in all wikis except English Wikipedia and Wikimedia Commons
(we will start dual writes on these two wikis next week). We have started
to backfill existing rows with the new schema, but it will take weeks to
finish on large wikis.
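
A rough sketch of how one might check how far the backfill has come on a
given wiki. It assumes that rows not yet backfilled have a NULL
pl_target_id, which is an assumption about the transitional schema and
not something stated above. sqlite3 stands in for the real database, the
rows are invented, and backfill_progress is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pagelinks (
        pl_from      INTEGER NOT NULL,
        pl_namespace INTEGER NOT NULL,
        pl_title     TEXT    NOT NULL,
        pl_target_id INTEGER  -- assumed NULL until written or backfilled
    );
    INSERT INTO pagelinks VALUES
        (10, 0, 'Berlin', 1),     -- new row, written in both schemas
        (11, 0, 'Berlin', NULL),  -- old row, not yet backfilled
        (12, 0, 'Paris',  2);
""")

def backfill_progress(conn):
    """Return (rows with pl_target_id set, total rows)."""
    # COUNT(column) counts only non-NULL values; COUNT(*) counts all rows.
    total, filled = conn.execute(
        "SELECT COUNT(*), COUNT(pl_target_id) FROM pagelinks"
    ).fetchone()
    return filled, total

filled, total = backfill_progress(conn)
print(f"{filled} of {total} rows have pl_target_id set")
```

On a large wiki a full count over pagelinks is expensive, so a check like
this is better run sparingly or on a sample.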

So if you or your tools query this table directly, you will need to
update them accordingly. I will write a reminder before dropping the old
columns once the data has been fully backfilled.

You can keep track of the general long-term work in T300222
<https://phabricator.wikimedia.org/T300222> and the specific work for
pagelinks in T299947 <https://phabricator.wikimedia.org/T299947>. You can
also read more on the reasoning in T222224
<https://phabricator.wikimedia.org/T222224> or the previous announcement
<https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/message/U2U6TXIBABU3KDCVUOITIGI5OJ4COBSW/>.

Thank you,
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
Re: Changes in schema of pagelinks tables
Hello,
The data migration has been completed for several sections. We will start
dropping the old columns on s4
<https://noc.wikimedia.org/conf/dblists/s4.dblist> and s5
<https://noc.wikimedia.org/conf/dblists/s5.dblist> this week and next, and
right after that on s3
<https://noc.wikimedia.org/conf/dblists/s3.dblist>.

The data has not been fully migrated on s1 (enwiki) and s8 (wikidata). In
the remaining sections it has been migrated on some wikis but not all.

If your tools rely on pagelinks, you might need to update your tools now.
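
While some sections have already lost the old columns and others are not
fully migrated, a tool that runs across many wikis may need to handle
both states. A minimal sketch of one way to do that, picking the query by
checking which columns exist. It uses sqlite3 and its PRAGMA table_info as
a stand-in; on MariaDB the equivalent lookup would go through
information_schema.columns. The helpers has_old_columns, choose_query and
make_demo_db and the sample rows are hypothetical.

```python
import sqlite3

OLD_QUERY = (
    "SELECT pl_from FROM pagelinks "
    "WHERE pl_namespace = ? AND pl_title = ? ORDER BY pl_from"
)
NEW_QUERY = (
    "SELECT pl_from FROM pagelinks "
    "JOIN linktarget ON pl_target_id = lt_id "
    "WHERE lt_namespace = ? AND lt_title = ? ORDER BY pl_from"
)

def has_old_columns(conn):
    """True if pagelinks still has pl_namespace and pl_title."""
    # PRAGMA table_info is SQLite-specific; row[1] is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(pagelinks)")}
    return {"pl_namespace", "pl_title"} <= columns

def choose_query(conn):
    """Use the old columns while they exist, the join once they are gone."""
    return OLD_QUERY if has_old_columns(conn) else NEW_QUERY

def make_demo_db(dropped):
    """Build a tiny demo database in either schema state (invented data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE linktarget "
        "(lt_id INTEGER PRIMARY KEY, lt_namespace INTEGER, lt_title TEXT)"
    )
    if dropped:
        conn.execute("CREATE TABLE pagelinks (pl_from INTEGER, pl_target_id INTEGER)")
        conn.execute("INSERT INTO linktarget VALUES (1, 0, 'Berlin')")
        conn.execute("INSERT INTO pagelinks VALUES (10, 1)")
    else:
        conn.execute(
            "CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, "
            "pl_title TEXT, pl_target_id INTEGER)"
        )
        conn.execute("INSERT INTO pagelinks VALUES (10, 0, 'Berlin', NULL)")
    return conn

for dropped in (False, True):
    conn = make_demo_db(dropped)
    rows = conn.execute(choose_query(conn), (0, "Berlin")).fetchall()
    print(dropped, [r[0] for r in rows])
```

Once the old columns are gone everywhere, the fallback branch can simply
be deleted and only the join kept.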

Best

On Wed, 18 Oct 2023 at 13:46, Amir Sarabadani <
asarabadani@wikimedia.org> wrote:

> [...]
Re: Changes in schema of pagelinks tables
Hi,
It's me again. The data has been migrated in all wikis except Turkish
Wikipedia and Chinese Wikipedia (the last two wikis of s2), and these two
will also be migrated in a week or so. So if your tools or queries rely on
the old columns of pagelinks, you need to migrate them to the normalized
schema ASAP.

We will start dropping the old columns gradually in two weeks.

Best

On Tue, 16 Jan 2024 at 11:56, Amir Sarabadani <
asarabadani@wikimedia.org> wrote:

> [...]