I tried using MWDumper to load the content of Wikipedia into a MySQL 5
database. I used tables.sql to create the tables, then tried writing
the data to MySQL with MWDumper and got the following result:
C:\Downloads>set class=mwdumper.jar;mysql-connector-java-3.0.11-stable-bin.jar
C:\Downloads>set data="C:\Downloads\enwiki-20070206-pages-articles.xml.bz2"
C:\Downloads>java -client -classpath mwdumper.jar;mysql-connector-java-3.0.11-stable-bin.jar org.mediawiki.dumper.Dumper "--output=mysql://127.0.0.1/enwiki?user=xxxx&password=xxxxxxx" "--format=sql:1.5" "C:\Downloads\enwiki-20070206-pages-articles.xml.bz2"
1.000 pages (148,148/sec), 1.000 revs (148,148/sec)
2.000 pages (156,104/sec), 2.000 revs (156,104/sec)
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(Unknown Source)
    at com.mysql.jdbc.EscapeProcessor.escapeSQL(EscapeProcessor.java:151)
    at com.mysql.jdbc.Statement.execute(Statement.java:845)
    at org.mediawiki.importer.SqlServerStream.writeStatement(Unknown Source)
    at org.mediawiki.importer.SqlWriter.flushInsertBuffer(Unknown Source)
    at org.mediawiki.importer.SqlWriter.bufferInsertRow(Unknown Source)
    at org.mediawiki.importer.SqlWriter15.writeRevision(Unknown Source)
    at org.mediawiki.importer.MultiWriter.writeRevision(Unknown Source)
    at org.mediawiki.importer.PageFilter.writeRevision(Unknown Source)
    at org.mediawiki.dumper.ProgressFilter.writeRevision(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.closeRevision(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.endElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at org.mediawiki.importer.XmlDumpReader.readDump(Unknown Source)
    at org.mediawiki.dumper.Dumper.main(Unknown Source)
--
________________________________________________________________________
Axel Ngonga University of Leipzig, Dpt. Computer Sciences
M.Sc. Business Information Systems Group
http://bis.informatik.uni-leipzig.de
Johannisgasse 26, Room 5-22
D-04103 Leipzig
fon: +49-341-9732341 * fax: +49-341-9732239 * mobile: +49-176-23517631
________________________________________________________________________
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/wikitech-l