Mailing List Archive

Documentation: import of database dump, mod_rewrite rules
Hi,

first of all, there's some progress in my attempt to extend and rewrite
the documentation for MediaWiki (see
http://meta.wikipedia.org/wiki/Documentation); there were some
discrepancies with Mav, which I hope we have now worked out.

Currently I'm structuring the sections, collecting text and testing some
instructions. Doing this, I ran into some problems.

(1) Importing a database dump - this applies to section
[http://meta.wikipedia.org/wiki/Documentation:Administration#Getting_data:_Importing_a_database_dump].

(1a) When importing the database dumps cur_ and old_ of the German
Wikipedia from 20031125 (bzip2 -dc cur_* | mysql -u wikiadmin
-padminpass wikidb, then calling php4 rebuildlinks.php),
rebuildlinks.php fails. The script outputs something like

Rebuilding link tables (pass 1).
1000 of 72120 articles scanned.
2000 of 72120 articles scanned.
...
46000 of 72120 articles scanned.
47000 of 72120 articles scanned.

and then gives an error message:

"Error in the database. There was a syntax error in the database
query. The last database query was: "INSERT INTO rebuildlinks
(rl_f_id,rl_f_title,rl_to) VALUES " from the function "".
MySQL reported error "1064: You have an error in your SQL syntax."
[...]"

(This is roughly translated from German; the original says "Fehler in
der Datenbank [...] Es gab einen Syntaxfehler in der Datenbankabfrage.
Die letzte Datenbankabfrage lautete: "INSERT INTO rebuildlinks
(rl_f_id,rl_f_title,rl_to) VALUES " aus der Funktion "". MySQL meldete
den Fehler: "1064: You have an error in your SQL syntax. Check the
manual that corresponds to your MySQL server version for the right
syntax to use near '' at line 1".")

The md5 checksums of old_ and cur_ matched those given on the website.

What is recommended to deal with an error like this? Wait a few days and
download a newer dump? Should the wikidb be dropped or cleared before
importing another dump?

(1b) When just importing cur_, rebuildlinks.php runs fine, but when
accessing the Main_Page ("Hauptseite"), it says that there are "0
articles", but the wiki pages exist, in this case, several thousand
pages. Is this the intended behaviour, or is there something else to do
after running rebuildlinks.php? Also, newly created pages are not being
counted.

(2) Rewrite Engine - this applies to section
[http://meta.wikipedia.org/wiki/Documentation:Configuration].

The statement refers to an external RewriteMap which I couldn't find:

RewriteMap ampescape int:ampescape
RewriteRule ^/wiki/(.*)$ /wiki.phtml?title=${ampescape:$1} [L]

The ../maintenance directory contains just apache-ampersand.diff. Is the
RewriteMap used by Wikipedia somewhere publicly available?

Thanks & greetings, -asb

Re: Documentation: import of database dump, mod_rewrite rules
On Nov 30, 2003, at 15:23, Agon S. Buchholz wrote:
> "Error in the database. There was a syntax error in the database
> query. The last database query was: "INSERT INTO rebuildlinks
> (rl_f_id,rl_f_title,rl_to) VALUES " from the function "".
> MySQL reported error "1064: You have an error in your SQL syntax."

The rebuildlinks script has been heavily rewritten; you may wish to grab
the current dev branch out of CVS (which we're running now on
Wikipedia) or wait a few days for the new stable release.

> (1b) When just importing cur_, rebuildlinks.php runs fine, but when
> accessing the Main_Page ("Hauptseite"), it says that there are "0
> articles", but the wiki pages exist, in this case, several thousand
> pages. Is this the intended behaviour, or is there something else to do
> after running rebuildlinks.php? Also, newly created pages are not
> being counted.

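To rebuild the article count manually:

-- Count main-namespace pages that are not redirects and contain a link
SELECT @foo:=COUNT(*) FROM cur
WHERE cur_namespace=0 AND cur_is_redirect=0 AND cur_text LIKE '%[[%';
-- Store the result as the site's article count
UPDATE site_stats SET ss_good_articles=@foo;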
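
The query above counts pages in the main namespace (0) that are not
redirects and whose text contains at least one "[[". As a rough
illustration of that rule only (this is not MediaWiki code), the same
filter can be sketched in Python over in-memory rows. The dictionary
keys mirror the cur columns from the query; the sample rows and the
function name count_good_articles are made up for the example.

```python
# Hypothetical sample rows; keys mirror the cur columns used in the query.
rows = [
    {"cur_namespace": 0, "cur_is_redirect": 0, "cur_text": "See [[Berlin]]."},
    {"cur_namespace": 0, "cur_is_redirect": 1, "cur_text": "#REDIRECT [[Berlin]]"},
    {"cur_namespace": 0, "cur_is_redirect": 0, "cur_text": "No links here."},
    {"cur_namespace": 1, "cur_is_redirect": 0, "cur_text": "About [[Berlin]]."},
]


def count_good_articles(rows):
    """Count main-namespace, non-redirect pages whose text contains '[['."""
    return sum(
        1
        for r in rows
        if r["cur_namespace"] == 0
        and r["cur_is_redirect"] == 0
        and "[[" in r["cur_text"]
    )


# Only the first sample row satisfies all three conditions.
good_articles = count_good_articles(rows)
print(good_articles)
```

A page with no "[[" in its text is not counted, which is why a wiki can
hold many pages and still report a lower article count.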


> The statement refers to an external RewriteMap which I couldn't
> find:
>
> RewriteMap ampescape int:ampescape
> RewriteRule ^/wiki/(.*)$ /wiki.phtml?title=${ampescape:$1} [L]
>
> The ../maintenance directory contains just apache-ampersand.diff. Is
> the RewriteMap used by Wikipedia somewhere publicly available?

That is a patch for Apache 1.3.x which creates a new internal
rewritemap function called ampescape. The one and only thing it does is
to escape the '&' character into '%26' so that titles including the
ampersand won't be corrupted when rewritten (since & is a reserved
character in query strings, as a field separator).
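
To show why the escaping matters, here is a small Python sketch. It is
not the Apache patch itself, only an imitation of the behaviour
described above: the function ampescape is named after the rewrite map,
rewrite is a hypothetical stand-in for the RewriteRule, and the title
"AT&T" is an invented example.

```python
from urllib.parse import parse_qs


def ampescape(title: str) -> str:
    """Imitate the described map: replace '&' with '%26', nothing else."""
    return title.replace("&", "%26")


def rewrite(path: str) -> str:
    """Rough stand-in for the rule: /wiki/X -> /wiki.phtml?title=X."""
    prefix = "/wiki/"
    if not path.startswith(prefix):
        return path
    return "/wiki.phtml?title=" + ampescape(path[len(prefix):])


# Without escaping, '&' splits the query string and the title is cut short.
unescaped_query = "title=" + "AT&T"
# With escaping, the whole title survives as one field value.
escaped_query = rewrite("/wiki/AT&T").split("?", 1)[1]

print(parse_qs(unescaped_query))  # title is parsed as 'AT' only
print(parse_qs(escaped_query))    # title is parsed as 'AT&T'
```

In the unescaped case the part after the "&" is treated as a separate
field, so the script receiving the request never sees the full title.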

-- brion vibber (brion @ pobox.com)