Mailing List Archive

Wiki conversion script updates
The conversion script now includes old revisions of articles. I've tried
it on the Esperanto and Polish databases, which I had handy, and so far it
seems to work. Please test on other languages if you dare.

It shouldn't be too hard to modify this code to extract just the old
versions from the old English wikipedia and drop them into the current
database there, as well.

Notes:
* Since user accounts are not transferred, there is no numerical user ID
to put in the old_user field. Currently this results in the wiki
thinking the user name in old_user_text is an IP address, trying to
mask the last 3 digits, and not making links to user pages in the
history lists. The digit masking is definitely wrong; not making the
links, however, is arguably correct behavior.

* The most recent revision still has its user, comment, and timestamp
wiped and replaced with "conversion script", "automatic conversion", and
time of conversion. Would it not be nicer to keep the previous user,
comment, and timestamp, as is done with the older revisions?

* We might, however, still want to add a note that conversion took
place, so it's an obvious cutoff in the history list.

* Do we want to run fixLinks() on the old page versions? (This changes
/subpage links into Page:subpage links.) Right now I do so to preserve
link functionality, but this may not be appropriate, as it changes the
content of previous versions slightly. The purpose of keeping old
versions is to see what changed, so we might prefer to have the
unchanged (and no longer working) /subpage links. Comments?
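
For anyone weighing in on the fixLinks() question, here is a minimal
sketch of the kind of rewrite involved. It is written in Python for
illustration only and is not the conversion script's actual code. The
double-bracket link syntax, the fix_links name, and the sep parameter are
all assumptions made for the example.

```python
import re

# Assumed link form: [[/subpage]] or [[/subpage|label]] (hypothetical syntax).
SUBPAGE_LINK = re.compile(r"\[\[/([^\]|]+)(\|[^\]]*)?\]\]")


def fix_links(text, page_title, sep=":"):
    """Rewrite relative /subpage links so they name their parent page."""
    def repl(match):
        target = match.group(1)       # subpage name without the leading slash
        label = match.group(2) or ""  # optional "|label" part, kept as-is
        return "[[" + page_title + sep + target + label + "]]"

    return SUBPAGE_LINK.sub(repl, text)


old_text = "See [[/Talk]] and [[/History|past]]."
print(fix_links(old_text, "Poland"))
```

Applying this to an old revision changes its stored text, which is the
trade-off raised above: the links keep working, but a diff against the
original no longer shows exactly what the author wrote.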

-- brion vibber (brion @ pobox.com)
RE: Wiki conversion script updates
On Tue, 2002-03-12 at 12:56, Magnus Manske wrote:
> Congratulations! You succeeded where I failed miserably.

Now, if I can only figure out how to read those occasional garbled
_current_ page files...

> Suggestion: Keep all old versions as they are, and generate an "artificial"
> edit, signed with "conversion script", where the actual "subpage
> translation" takes place. So all old versions are stored correctly, and the
> conversion itself is more transparent.

Sounds reasonable. I'll try hacking that into place later today.
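
A rough sketch of that approach, in Python for illustration rather than
the script's own code: every old revision is copied untouched, and one
extra revision signed "conversion script" carries the translated text.
The dict layout, the append_conversion_edit name, and the transform
argument are hypothetical and do not reflect the real database schema.

```python
from datetime import datetime, timezone


def append_conversion_edit(revisions, transform, when=None):
    """Return a new history with one extra revision holding converted text.

    revisions: list of dicts, oldest first, with 'user', 'comment',
               'timestamp' and 'text' keys (an assumed layout).
    transform: function applied to the newest text, e.g. a link fixer.
    """
    if not revisions:
        return []
    when = when or datetime.now(timezone.utc)
    latest = revisions[-1]
    converted = {
        "user": "conversion script",
        "comment": "automatic conversion",
        "timestamp": when,
        "text": transform(latest["text"]),
    }
    # Copy the old entries so the originals are never modified.
    return [dict(rev) for rev in revisions] + [converted]
```

With this shape the history list gets an obvious cutoff entry, and the
last human edit keeps its own user, comment, and timestamp.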

> Really, great work! :)

Thanks!

-- brion vibber (brion @ pobox.com)