Thank you so much for working on this, it was one of the most painful
aspects of core development!
It might be worth using a consistent gerrit topic or hashtag to make
finding the relevant patches easy.
On Sat, May 9, 2020 at 3:21 AM Amir Sarabadani <ladsgroup@gmail.com> wrote:
> Hello,
> In case you haven't done any changes on database schema of mediawiki core,
> let me explain the process to you (if you know this, feel free to skip this
> paragraph):
> * Mediawiki core supports three types of RDBMS: MySQL, Sqlite, Postgres. It
> used to be five (plus Oracle and MSSQL)
> * For each one of these types, you need to do three parts: 1- Change the
> tables.sql file so new installations get the new schema 2- Make .sql schema
> change file, like an "ALTER TABLE" for current installations so they can
> upgrade. 3- Wire that schema change file into *Updater.php file.
> * For example, this is a patch to drop a column:
> https://gerrit.wikimedia.org/r/c/mediawiki/core/+/473601 This file touches
> 14 different files, adds 94 lines and removes 30.
>
> This is bad for several reasons:
> * It is extremely complicated to do a even a simple schema change. Usually
> something as simple as adding an column takes a whole day for me. There are
> lots of complicating factors, like Sqlite doesn't have ALTER TABLE, so when
> you want to make a patch for adding a column, you need to make a temporary
> table with the new column, copy the old table data to it, drop the old
> table and then rename the old table.
> ** Imagine the pain and sorrow when you want to normalize a table meaning
> you need to do several schema changes: 1- Add a table, 2- Add a column on
> the old table, 3- make the column not-nullable when it's filled and make
> the old column nullable instead 4- drop the old column.
> * It's almost impossible to test all DBMS types, I don't have MSSQL or
> Oracle installed and I don't even know their differences with MySQL. I
> assume most other developers are good in one type, not all.
> * Writing raw sqls, specially duplicated ones, and doubly specially when we
> don't have CI to test (because we won't install propriety software in our
> infra) is pretty much prone to error. My favourite one was that a new
> column on a table was actually added to the wrong table in MSSQL and it
> went unnoticed for two years (four releases, including one LTS).
> * It's impossible to support more DBMS types through extensions or other
> third party systems. Because the maintainer needs to keep up with all
> patches we add to core and write their equivalents.
> * For lots of reasons, these schemas are diverging, there have been several
> work to just reduce this to a minimum.
>
> There was a RFC to introduce abstract schema and schema changes and it got
> accepted and I have been working to implement this:
> https://phabricator.wikimedia.org/T191231
>
> This is not a small task, and like any big work, it's important to cut it
> to small pieces and gradually improve things. So my plan is first, I
> abstract the schema (tables.sql files), then slowly I abstract schema
> changes. For now, the plan is to make these .sql files automatically
> generated through maintenance scripts. So we will have a file called
> tables.json and when running something like:
> php maintenance/generateSchemaSql.php --json maintenance/tables.json --sql
> maintenance/tables-generated.sql --type=mysql
> It would produce tables-generated.sql file. The code that produces it is
> Doctrine DBAL and this is already installed as a dev dependency of core
> because you would need Doctrine if you want to make a schema change, if you
> maintain an instance, you should not need anything. Most of the work for
> automatically generating schema is already merged and the last part that
> wires it (and migrates two tables) is up for review:
> https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
>
> My request is that I need to make lots of patches and since I'm doing this
> in my volunteer capacity, I need developers to review (and potentially help
> with the work if you're excited about this like me). Let me know if you're
> willing to be added in future patches and the current patch also welcomes
> any feedback: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
>
> I have added the documentation in
> https://www.mediawiki.org/wiki/Manual:Schema_changes for the plan and
> future changes. The ideal goal is that when you want to do a schema change,
> you just change tables.json and create a json file that is snapshot of
> before and after table (remember, sqlite doesn't have alter table, meaning
> it has to know the whole table). Also, once we are in a good shape in
> migrating mediawiki core, we can start cleaning up extensions.
>
> Any feedback is also welcome.
>
> Best
> --
> Amir (he/him)
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
MediaWiki-l mailing list
To unsubscribe, go to:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
aspects of core development!
It might be worth using a consistent gerrit topic or hashtag to make
finding the relevant patches easy.
On Sat, May 9, 2020 at 3:21 AM Amir Sarabadani <ladsgroup@gmail.com> wrote:
> Hello,
> In case you haven't done any changes on database schema of mediawiki core,
> let me explain the process to you (if you know this, feel free to skip this
> paragraph):
> * Mediawiki core supports three types of RDBMS: MySQL, Sqlite, Postgres. It
> used to be five (plus Oracle and MSSQL)
> * For each one of these types, you need to do three parts: 1- Change the
> tables.sql file so new installations get the new schema 2- Make .sql schema
> change file, like an "ALTER TABLE" for current installations so they can
> upgrade. 3- Wire that schema change file into *Updater.php file.
> * For example, this is a patch to drop a column:
> https://gerrit.wikimedia.org/r/c/mediawiki/core/+/473601 This file touches
> 14 different files, adds 94 lines and removes 30.
>
> This is bad for several reasons:
> * It is extremely complicated to do a even a simple schema change. Usually
> something as simple as adding an column takes a whole day for me. There are
> lots of complicating factors, like Sqlite doesn't have ALTER TABLE, so when
> you want to make a patch for adding a column, you need to make a temporary
> table with the new column, copy the old table data to it, drop the old
> table and then rename the old table.
> ** Imagine the pain and sorrow when you want to normalize a table meaning
> you need to do several schema changes: 1- Add a table, 2- Add a column on
> the old table, 3- make the column not-nullable when it's filled and make
> the old column nullable instead 4- drop the old column.
> * It's almost impossible to test all DBMS types, I don't have MSSQL or
> Oracle installed and I don't even know their differences with MySQL. I
> assume most other developers are good in one type, not all.
> * Writing raw sqls, specially duplicated ones, and doubly specially when we
> don't have CI to test (because we won't install propriety software in our
> infra) is pretty much prone to error. My favourite one was that a new
> column on a table was actually added to the wrong table in MSSQL and it
> went unnoticed for two years (four releases, including one LTS).
> * It's impossible to support more DBMS types through extensions or other
> third party systems. Because the maintainer needs to keep up with all
> patches we add to core and write their equivalents.
> * For lots of reasons, these schemas are diverging, there have been several
> work to just reduce this to a minimum.
>
> There was a RFC to introduce abstract schema and schema changes and it got
> accepted and I have been working to implement this:
> https://phabricator.wikimedia.org/T191231
>
> This is not a small task, and like any big work, it's important to cut it
> to small pieces and gradually improve things. So my plan is first, I
> abstract the schema (tables.sql files), then slowly I abstract schema
> changes. For now, the plan is to make these .sql files automatically
> generated through maintenance scripts. So we will have a file called
> tables.json and when running something like:
> php maintenance/generateSchemaSql.php --json maintenance/tables.json --sql
> maintenance/tables-generated.sql --type=mysql
> It would produce tables-generated.sql file. The code that produces it is
> Doctrine DBAL and this is already installed as a dev dependency of core
> because you would need Doctrine if you want to make a schema change, if you
> maintain an instance, you should not need anything. Most of the work for
> automatically generating schema is already merged and the last part that
> wires it (and migrates two tables) is up for review:
> https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
>
> My request is that I need to make lots of patches and since I'm doing this
> in my volunteer capacity, I need developers to review (and potentially help
> with the work if you're excited about this like me). Let me know if you're
> willing to be added in future patches and the current patch also welcomes
> any feedback: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/595240
>
> I have added the documentation in
> https://www.mediawiki.org/wiki/Manual:Schema_changes for the plan and
> future changes. The ideal goal is that when you want to do a schema change,
> you just change tables.json and create a json file that is snapshot of
> before and after table (remember, sqlite doesn't have alter table, meaning
> it has to know the whole table). Also, once we are in a good shape in
> migrating mediawiki core, we can start cleaning up extensions.
>
> Any feedback is also welcome.
>
> Best
> --
> Amir (he/him)
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
MediaWiki-l mailing list
To unsubscribe, go to:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l