New mediawiki.page_change.v1 event stream publicly available
The Wikimedia Data Engineering team is pleased to announce that a new event
stream, mediawiki.page_change.v1, is now publicly available at
stream.wikimedia.org (here
<https://stream.wikimedia.org/v2/ui/#/?streams=mediawiki.page_change.v1>).

The new event stream models page changes using a consolidated changelog
data model, whereas existing streams, like page-create and revision-create,
model each type of page change as a separate stream. With the existing
streams, you have to consume several of them to understand how a
MediaWiki page changed. With the new stream, when a page is created,
edited, or deleted, a single event captures the new state as well as the
state prior to the change.
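
As an illustration of reading one such event, here is a minimal Python
sketch. It assumes the stream is served as server-sent events at the
/v2/stream/ path matching the UI link above, and that events carry the
fields page_change_kind, page.page_id, revision.rev_id, and prior_state.
Those names and the URL are assumptions to verify against the schema
definition; the helper names iter_sse_events, describe_change, and consume
are hypothetical.

```python
import itertools
import json
import urllib.request
from typing import Iterable, Iterator

# Assumed endpoint, inferred from the UI link in this announcement.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/mediawiki.page_change.v1"


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per complete server-sent event."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_parts))
            data_parts = []


def describe_change(event: dict) -> str:
    """Summarize a change using the new state and the prior state."""
    kind = event.get("page_change_kind", "unknown")
    page_id = event.get("page", {}).get("page_id")
    new_rev = event.get("revision", {}).get("rev_id")
    # prior_state is assumed to be absent when there is no earlier state.
    prior = event.get("prior_state") or {}
    old_rev = prior.get("revision", {}).get("rev_id")
    return f"{kind}: page {page_id} rev {old_rev} -> {new_rev}"


def consume(limit: int = 5) -> None:
    """Print summaries of the first `limit` live events (needs network)."""
    request = urllib.request.Request(
        STREAM_URL,
        headers={
            "Accept": "text/event-stream",
            # Placeholder value; identify your own client here.
            "User-Agent": "page-change-example/0.1",
        },
    )
    with urllib.request.urlopen(request) as response:
        lines = (chunk.decode("utf-8") for chunk in response)
        for event in itertools.islice(iter_sse_events(lines), limit):
            print(describe_change(event))
```

Calling consume() would read from the live stream; the two helper
functions can also be exercised offline with any iterable of text lines.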

For more information on what fields you can expect to see in the events,
see the schema definition here
<https://schema.wikimedia.org/repositories//primary/jsonschema/mediawiki/page/change/latest.yaml>.

From now on, new stream names will carry a major version suffix (as in
mediawiki.page_change.v1) and will not contain hyphens. For more
information, see here
<https://wikitech.wikimedia.org/wiki/Event_Platform/Stream_Configuration#Stream_versioning>.

Benefits:

- Only one stream to consume (instead of having to consume page-create,
  page-delete, and revision-create streams to have a full picture of page
  changes)
- Events are ordered for a given page_id (delete will not come before
  create)
- Latest event has current and prior state
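
To show why per-page ordering matters, here is a small Python sketch that
keeps a "latest known state" table by applying events in arrival order.
The field names (wiki_id, page_change_kind, page.page_id, page.page_title,
revision.rev_id) are assumptions to check against the schema, the sample
events are made up, and apply_page_change is a hypothetical helper.

```python
from typing import Dict, Tuple

# Pages are keyed by wiki and page_id, since page IDs repeat across wikis.
PageKey = Tuple[str, int]


def apply_page_change(state: Dict[PageKey, dict], event: dict) -> None:
    """Update the table in place with one page change event."""
    key = (event.get("wiki_id", ""), event["page"]["page_id"])
    if event.get("page_change_kind") == "delete":
        # A delete removes the page from the table.
        state.pop(key, None)
        return
    # Any other kind overwrites the entry with the state the event carries.
    state[key] = {
        "page_title": event["page"].get("page_title"),
        "rev_id": event.get("revision", {}).get("rev_id"),
    }


# Made-up events for illustration only.
sample_events = [
    {"wiki_id": "examplewiki", "page_change_kind": "create",
     "page": {"page_id": 1, "page_title": "Sandbox"},
     "revision": {"rev_id": 10}},
    {"wiki_id": "examplewiki", "page_change_kind": "edit",
     "page": {"page_id": 1, "page_title": "Sandbox"},
     "revision": {"rev_id": 11}},
    {"wiki_id": "examplewiki", "page_change_kind": "create",
     "page": {"page_id": 2, "page_title": "Other"},
     "revision": {"rev_id": 12}},
    {"wiki_id": "examplewiki", "page_change_kind": "delete",
     "page": {"page_id": 1, "page_title": "Sandbox"},
     "revision": {"rev_id": 11}},
]

state: Dict[PageKey, dict] = {}
for sample_event in sample_events:
    apply_page_change(state, sample_event)
```

Because a delete cannot arrive before the matching create for the same
page_id, applying events one by one is enough here; no buffering or
reordering across streams is needed.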

The existing event streams, such as page-create, revision-create, and
page-delete, will remain available for now. We encourage people to
migrate to the new consolidated stream when they can, as we plan to mark
the existing streams as deprecated within the next year. We will send
another communication about this once the plan is decided.

If you have any questions or issues, please file a Phabricator ticket here
<https://phabricator.wikimedia.org/project/board/6628/>.

--

Luke Bowmaker

Data Engineering - Product Manager