Mailing List Archive

[Cargo] table format for _pageData and _fileData
Hi,

for our production system I’m using a MariaDB Galera cluster as RDMS backend. Though there is a feature to enable (experimental) replication of MyISAM tables, this doesn't play well with Cargo. Certain operations involving _pageData tables caused the cluster to reach an inconsistent data state, thus stopping replication and ultimately falling apart.

I could isolate the cause to the following transaction (which is embedded between two inserts for the same thing):

BEGIN /* Wikimedia\Rdbms\Database::begin */
SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE 'cargo\_\_staff\_\_NEXT'
SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE 'cargo\_\_staff\_\_\_files'
DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo__staff` WHERE `_pageID` = '273'
DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo___pageData` WHERE `_pageID` = '273'
COMMIT /* Wikimedia\Rdbms\Database::commit */

cargo__staff table being InnoDB, cargo___pageData being MyISAM. Unfortunately this leads to the delete statement on the InnoDB table not being replicated, while the delete on the MyISAM table is. Thus on the following insert the row already exists, causing a unique key violation and inconsistent cluster state.

Is it really still necessary to use MyISAM for these tables? Full text indices are available on InnoDB for quite some time now, so I’m wondering whether this still needs to be supported or if it would be possible to make both choices available.


Best: Jan.


--
idea-sketch

Jan Böhme & Uwe Schützenmeister
Lößnitzstr. 14
01097 Dresden

www.idea-sketch.com <http://www.idea-sketch.com/>

Tel.: +49 . (0)351 . 810 74 250
Mobil: +49 . (0)179 .53 41 641
Re: [Cargo] table format for _pageData and _fileData [ In reply to ]
Hi Jan,

Oh, that's too bad. I didn't know that MyISAM doesn't support database
clustering. (You might be the first person to run Cargo on a clustered
database - it's good to know that it seems to be working, other than this
one problem.)

You're right that the MyISAM requirement is only for old database versions
- the last version of MySQL that had this problem was 5.6, and that version
came out in 2013, and will stop being supported in a little over a year. I
could have added a new setting like
$wgCargoUseMyISAMForTablesWithSearchtextFields, but that seemed like
overkill, so I decided instead to just remove this code that forces the use
of MyISAM. Hopefully there aren't too many people who (a) are using Cargo
with MySQL <= 5.6 (or its MariaDB equivalent), (b) use InnoDB by default,
(c) use the _pageData or _fileData tables, and (d) will recreate these
tables in the future. If there are, they'll have to either manually re-add
this code in, or update to a more recent DB version.

Anyway, if you get the latest version of the Cargo code, and recreate these
tables, the problems will hopefully be gone.

-Yaron

On Fri, Nov 8, 2019 at 6:52 AM Jan Böhme <jan@idea-sketch.com> wrote:

> Hi,
>
> for our production system I’m using a MariaDB Galera cluster as RDMS
> backend. Though there is a feature to enable (experimental) replication of
> MyISAM tables, this doesn't play well with Cargo. Certain operations
> involving _pageData tables caused the cluster to reach an inconsistent data
> state, thus stopping replication and ultimately falling apart.
>
> I could isolate the cause to the following transaction (which is embedded
> between two inserts for the same thing):
>
> BEGIN /* Wikimedia\Rdbms\Database::begin */
> SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE
> 'cargo\_\_staff\_\_NEXT'
> SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE
> 'cargo\_\_staff\_\_\_files'
> DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo__staff` WHERE
> `_pageID` = '273'
> DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo___pageData`
> WHERE `_pageID` = '273'
> COMMIT /* Wikimedia\Rdbms\Database::commit */
>
> cargo__staff table being InnoDB, cargo___pageData being MyISAM.
> Unfortunately this leads to the delete statement on the InnoDB table not
> being replicated, while the delete on the MyISAM table is. Thus on the
> following insert the row already exists, causing a unique key violation and
> inconsistent cluster state.
>
> Is it really still necessary to use MyISAM for these tables? Full text
> indices are available on InnoDB for quite some time now, so I’m wondering
> whether this still needs to be supported or if it would be possible to make
> both choices available.
>
>
> Best: Jan.
>
>
> --
> idea-sketch
>
> Jan Böhme & Uwe Schützenmeister
> Lößnitzstr. 14
> 01097 Dresden
>
> www.idea-sketch.com <http://www.idea-sketch.com/>
>
> Tel.: +49 . (0)351 . 810 74 250
> Mobil: +49 . (0)179 .53 41 641
>
> _______________________________________________
> MediaWiki-l mailing list
> To unsubscribe, go to:
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>


--
WikiWorks · MediaWiki Consulting · http://wikiworks.com
_______________________________________________
MediaWiki-l mailing list
To unsubscribe, go to:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
Re: [Cargo] table format for _pageData and _fileData [ In reply to ]
Hi Yaron,

Thanks a lot for your fast response and the code modification, this saves us some trouble.

Apart from this using Cargo in a clustered environment seems to work fine.
We’re just starting out with it, obviously, having rewritten a formerly SMW-based application, but I’m confident, it’s looking good so far.

Thank you again: Jan.


> Am 08.11.2019 um 17:00 schrieb Yaron Koren <yaron@wikiworks.com>:
>
> Hi Jan,
>
> Oh, that's too bad. I didn't know that MyISAM doesn't support database
> clustering. (You might be the first person to run Cargo on a clustered
> database - it's good to know that it seems to be working, other than this
> one problem.)
>
> You're right that the MyISAM requirement is only for old database versions
> - the last version of MySQL that had this problem was 5.6, and that version
> came out in 2013, and will stop being supported in a little over a year. I
> could have added a new setting like
> $wgCargoUseMyISAMForTablesWithSearchtextFields, but that seemed like
> overkill, so I decided instead to just remove this code that forces the use
> of MyISAM. Hopefully there aren't too many people who (a) are using Cargo
> with MySQL <= 5.6 (or its MariaDB equivalent), (b) use InnoDB by default,
> (c) use the _pageData or _fileData tables, and (d) will recreate these
> tables in the future. If there are, they'll have to either manually re-add
> this code in, or update to a more recent DB version.
>
> Anyway, if you get the latest version of the Cargo code, and recreate these
> tables, the problems will hopefully be gone.
>
> -Yaron
>
> On Fri, Nov 8, 2019 at 6:52 AM Jan Böhme <jan@idea-sketch.com <mailto:jan@idea-sketch.com>> wrote:
>
>> Hi,
>>
>> for our production system I’m using a MariaDB Galera cluster as RDMS
>> backend. Though there is a feature to enable (experimental) replication of
>> MyISAM tables, this doesn't play well with Cargo. Certain operations
>> involving _pageData tables caused the cluster to reach an inconsistent data
>> state, thus stopping replication and ultimately falling apart.
>>
>> I could isolate the cause to the following transaction (which is embedded
>> between two inserts for the same thing):
>>
>> BEGIN /* Wikimedia\Rdbms\Database::begin */
>> SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE
>> 'cargo\_\_staff\_\_NEXT'
>> SHOW /* Wikimedia\Rdbms\DatabaseMysqlBase::tableExists */ TABLES LIKE
>> 'cargo\_\_staff\_\_\_files'
>> DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo__staff` WHERE
>> `_pageID` = '273'
>> DELETE /* Wikimedia\Rdbms\Database::delete */ FROM `cargo___pageData`
>> WHERE `_pageID` = '273'
>> COMMIT /* Wikimedia\Rdbms\Database::commit */
>>
>> cargo__staff table being InnoDB, cargo___pageData being MyISAM.
>> Unfortunately this leads to the delete statement on the InnoDB table not
>> being replicated, while the delete on the MyISAM table is. Thus on the
>> following insert the row already exists, causing a unique key violation and
>> inconsistent cluster state.
>>
>> Is it really still necessary to use MyISAM for these tables? Full text
>> indices are available on InnoDB for quite some time now, so I’m wondering
>> whether this still needs to be supported or if it would be possible to make
>> both choices available.
>>
>>
>> Best: Jan.
>>
>>
>> --
>> idea-sketch
>>
>> Jan Böhme & Uwe Schützenmeister
>> Lößnitzstr. 14
>> 01097 Dresden
>>
>> www.idea-sketch.com <http://www.idea-sketch.com/> <http://www.idea-sketch.com/ <http://www.idea-sketch.com/>>
>>
>> Tel.: +49 . (0)351 . 810 74 250
>> Mobil: +49 . (0)179 .53 41 641
>>
>> _______________________________________________
>> MediaWiki-l mailing list
>> To unsubscribe, go to:
>> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l <https://lists.wikimedia.org/mailman/listinfo/mediawiki-l>
>>
>
>
> --
> WikiWorks · MediaWiki Consulting · http://wikiworks.com <http://wikiworks.com/>
> _______________________________________________
> MediaWiki-l mailing list
> To unsubscribe, go to:
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l <https://lists.wikimedia.org/mailman/listinfo/mediawiki-l>
--
idea-sketch

Jan Böhme & Uwe Schützenmeister
Lößnitzstr. 14
01097 Dresden

www.idea-sketch.com <http://www.idea-sketch.com/>

Tel.: +49 . (0)351 . 810 74 250
Mobil: +49 . (0)179 .53 41 641