Mailing List Archive

converting from latin1 to unicode?
Hello,

I'm running a German wiki configured for latin1. When I switch the
configuration to Unicode instead, all special characters such as ÄÖÜ are
broken in the browser (and also in the database, when I look at the tables
via phpMyAdmin). I can undo this by running the setup again and configuring
latin1. I'm no specialist in these things, but using Unicode seems more
"modern" to me. Is that correct, or should I just stay with latin1? If
converting to Unicode makes sense, how do I do it? What are the
consequences of using one or the other?

Sorry for these questions from someone who is mainly just a "user" but who
also does some things on a server. Any hints, even just links to further
reading, would be appreciated.

Nikolai
Re: converting from latin1 to unicode?
On Mon, 11 Oct 2004 15:11:31 +0200, Nikolai Neumayer
<neum@ai.wu-wien.ac.at> wrote:

> Hello,
>
> I'm running a German wiki configured using latin1. Trying to use unicode
> instead of latin1, all special characters like ÄÖÜ etc. are broken in the
> browser (and also in the database, if I take a look into the tables via
> phpmyadmin). This effect can be undone by running the setup again
> configuring latin1. Not being a specialist in such things it seems to me
> to [...]

Just make a dump of your database, like this:

$ man mysqldump
$ mysqldump -u user -p --add-drop-table database > database.sql
$ cp database.sql database.sql.backup

Then run that database.sql file through a converter. I use recode, which is
pretty powerful.
On Debian, installing it is just:

$ apt-get install recode

$ man recode
$ recode -l
$ recode l1..u8 database.sql

This should normally work fine.
Then load it back into the database:

$ man mysql
$ mysql -u user -p database < database.sql

In any case, read the relevant man pages before you run the commands, so
that you get the syntax right.
I often make mistakes :-)
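
To try the conversion step safely before touching a real dump, here is a
minimal sketch on a throwaway file. It assumes iconv is installed and uses
it in place of recode; the file names and the table name "demo" are made up
for illustration and are not part of MediaWiki.

```shell
#!/bin/sh
# Stand-in for a real dump: \304 \326 \334 are the Latin-1 bytes for Ä Ö Ü.
# (The table name "demo" and the file names are made up for this sketch.)
printf 'INSERT INTO demo VALUES (1, "\304\326\334");\n' > sample.sql

# Keep an untouched copy before converting.
cp sample.sql sample.sql.backup

# iconv writes to stdout instead of converting in place, so read from the
# backup and overwrite the working file.
iconv -f ISO-8859-1 -t UTF-8 sample.sql.backup > sample.sql

# Each of the three umlauts now takes two bytes instead of one.
before=$(wc -c < sample.sql.backup | tr -d ' ')
after=$(wc -c < sample.sql | tr -d ' ')
echo "grew by $((after - before)) bytes"   # prints: grew by 3 bytes
```

Unlike recode, which rewrites the file in place when given a file name,
iconv needs separate input and output files, which is why the backup copy
doubles as the input here.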

Of course you can use phpMyAdmin for the dumping and loading, but I think
it's a bit long-winded for this task.

Unix is cool :-)

--tic

PS: Greetings to the city I was born in :-)

> _______________________________________________
> MediaWiki-l mailing list
> MediaWiki-l@Wikimedia.org
> http://mail.wikipedia.org/mailman/listinfo/mediawiki-l
AW: converting from latin1 to unicode?
Thank you, tic. My wiki now looks perfect in the browser, even though I'm
using Unicode now!

How I got to near-perfect satisfaction:
I made the dump on a Unix machine, moved the .sql file to my Windows 2000
PC, and used recode to convert it to Unicode. After the conversion, the
.sql file showed broken special characters in Windows Notepad. I ignored
this and loaded the dump into the database. After running the MediaWiki
config again and setting the language to "de - Deutsch - Unicode", the wiki
looks as it should.
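
Rather than judging the converted file by how an editor displays it, one
can ask a tool whether the bytes decode as UTF-8. The following is a
minimal sketch, assuming iconv is available; is_utf8 is a hypothetical
helper name and the two test files are made up for illustration.

```shell
#!/bin/sh
# is_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
# (is_utf8 is a hypothetical helper name, not part of any tool.)
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Two throwaway files holding the word "Käse" in each encoding.
printf 'K\344se\n' > latin1.txt       # \344 is "ä" as one Latin-1 byte
printf 'K\303\244se\n' > utf8.txt     # \303\244 is "ä" as two UTF-8 bytes

for f in latin1.txt utf8.txt; do
    if is_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8"
    fi
done
```

The check has limits: a pure ASCII file passes, and so does a file that was
converted twice by mistake, because doubly encoded text is still valid
UTF-8. It only catches a dump that was never converted at all.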

Still a bit confused:
What I don't understand is why phpMyAdmin shows broken special characters
in the database. It seems to me that there are (at least) two views of the
database: one through the MediaWiki script and another through the
phpMyAdmin script.

Perfect confusion:
Strangely enough (see above), I'm running another German wiki on the same
Unix machine under the same user, using a different database. This second
wiki is also set to "de - Deutsch - Unicode" via the MediaWiki config.
There are no problems browsing wiki pages with special characters, and
phpMyAdmin also shows pretty Äs, Ös, etc. in the database.

Conclusion:
I'm happy that the front end looks perfect for my wiki users, and I'm
thankful for tic's help.
But I'm also curious why things in the database look the way they do.
Maybe there is a problem in one of the databases that MediaWiki is
compensating for at present but that could cause trouble in the future, or
so my gut tells me...

Niki

-----Original Message-----
From: mediawiki-l-bounces@Wikimedia.org
[mailto:mediawiki-l-bounces@Wikimedia.org] On Behalf Of tic@tictric.net
Sent: Tuesday, 12 October 2004 00:05
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] converting from latin1 to unicode?


Re: AW: converting from latin1 to unicode?
> Still a bit confused:
> What I do not understand is the fact, phpmyadmin shows broken
> special-characters in the database. --> This seems to me, like there are
> (at least) two views to the database - one using the mediawiki-script and
> another one using the phpmyadmin-script.

I don't know much about MySQL and prefer PostgreSQL, but as far as I know
MySQL knows nothing about Unicode yet. That is about to change, if I'm not
mistaken.
PostgreSQL, on the other hand, has not only known Unicode for ages but does
other useful things too :-)
Maybe one day I can move my wikis to a PostgreSQL database and abandon
MySQL completely.

> Perfect confusion:
> Strange enough (see above), I'm running another German wiki on the same
> unix-machine under the same user, using another database. This second
> wiki is also set to use "de - Deutsch - Unicode" via mediawiki-config.
> There are no problems, browsing through wiki-pages with
> special-characters - and also phpmyadmin shows pretty Äs,Ös etc. in the
> database.

Tell your browser (I use Opera) to render the page as Unicode and the
database entries will look right.
phpMyAdmin, on the other hand, uses ISO-8859-1 and will render the umlauts
as squares or something similar if you are running a German translation of
the interface.
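
If phpMyAdmin really is reading the stored UTF-8 bytes as ISO-8859-1, the
"two views" can be reproduced on the command line. This is only a sketch
under that assumption, with made-up file names and iconv assumed to be
installed; it says nothing about how phpMyAdmin is actually configured.

```shell
#!/bin/sh
# "ä" stored as UTF-8 is the two bytes C3 A4.
printf '\303\244' > stored.bin

# View 1: the raw bytes, which a UTF-8-aware reader shows as one "ä".
od -An -tx1 stored.bin

# View 2: a Latin-1 reader treats each byte as a character of its own,
# C3 as "Ã" and A4 as "¤". Re-encoding that misreading as UTF-8 makes it
# visible on a UTF-8 terminal as "Ã¤".
iconv -f ISO-8859-1 -t UTF-8 stored.bin > misread.txt
cat misread.txt; echo
od -An -tx1 misread.txt    # four bytes: c3 83 c2 a4
```

The data in stored.bin is the same in both views; only the decoding
differs. That would be consistent with the wiki looking right while another
tool shows broken characters for the same rows.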

Code pages are a long, sad story.
Back in the late sixties nobody could imagine that non-English speakers
would ever consider using a computer other than by feeding it strips of
paper with holes in them. Or so.

--tic

AW: AW: converting from latin1 to unicode?
Thanks for your thoughts, tic, I'm now a bit closer to understanding...

Nikolai

-----Original Message-----
From: mediawiki-l-bounces@Wikimedia.org
[mailto:mediawiki-l-bounces@Wikimedia.org] On Behalf Of tic@tictric.net
Sent: Wednesday, 13 October 2004 00:23
To: MediaWiki announcements and site admin list
Subject: Re: AW: [Mediawiki-l] converting from latin1 to unicode?


Re: AW: AW: converting from latin1 to unicode?
On Wed, 13 Oct 2004 10:35:55 +0200, Nikolai Neumayer
<neum@ai.wu-wien.ac.at> wrote:

> Thanks for your thoughts, tic, I'm now a bit closer to understanding...
>
> Nikolai

http://tldp.org/HOWTO/Unicode-HOWTO-1.html
http://www.unicode.org/