Mailing List Archive

Issue with encoding
Hello,
I'm working on a wiki that displays some characters incorrectly. The wiki
uses UTF-8, so I don't think the encoding setting itself is the problem.

You can see an example on the page Statik A – Sub Bavaria (sub-bavaria.de)
<http://www.sub-bavaria.de/w/index.php?title=Statik_A>.

Could you please point me to something I should look for, so I can fix
this issue?
The wiki was previously on version 1.31; I've upgraded it to 1.38.

Thanks for your help and understanding!

Best regards,
Zoran
Re: Issue with encoding
Ã¤ is what you get when you take ä encoded as UTF-8 and interpret it as
ISO-8859-1. So what probably happened is that some text that was encoded
as UTF-8 was treated as if it was ISO-8859-1/windows-1252 and
(unnecessarily) converted to UTF-8.
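
As a minimal illustration of the effect described above (not taken from the wiki in question), the following Python sketch encodes ä as UTF-8, misreads the bytes as ISO-8859-1, and then encodes the misread text as UTF-8 again. The variable names are made up for this example.

```python
# Illustrative only: reproduce the double-encoding effect step by step.
original = "ä"
utf8_bytes = original.encode("utf-8")       # b'\xc3\xa4'
misread = utf8_bytes.decode("iso-8859-1")   # two bytes read as two Latin-1 characters
double_encoded = misread.encode("utf-8")    # what gets stored after the needless conversion

print(misread)          # Ã¤
print(double_encoded)   # b'\xc3\x83\xc2\xa4'
```

A reader that correctly decodes the stored bytes as UTF-8 then shows Ã¤ instead of ä.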

Common causes of this sort of thing:
- A very, very old wiki from before MediaWiki adopted UTF-8 that wasn't
upgraded properly. (I think MW adopted UTF-8 before MediaWiki 1.5, so it
would have to be truly ancient.)
- Restoring a DB from backup with some wrong options related to charset.
- Converting the charset of DB columns if they were originally mislabeled.

If it's the entire DB that is broken, I think the easiest fix might be to
take a DB dump, use the iconv command line tool to convert UTF-8 ->
windows-1252 (to undo one layer of conversion), and then import the result
as if it was UTF-8.
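
The same "undo one layer" idea can be sketched in Python. This is only an illustration under assumptions: the function names and the file-based helper are hypothetical, the whole dump is read into memory, and cp1252 is assumed to be the codec the text was mistaken for (ISO-8859-1 may be the right choice on some setups).

```python
def undo_one_layer(data: bytes, codec: str = "cp1252") -> bytes:
    """Reverse one mistaken '<codec> -> UTF-8' conversion on raw dump bytes.

    Raises UnicodeEncodeError if the text contains a character the codec
    cannot represent, which suggests that part was not double-encoded.
    """
    # Decode the doubly encoded bytes, then map each character back to the
    # single byte it was mistaken for. The result is the original UTF-8.
    return data.decode("utf-8").encode(codec)


def convert_dump(src_path: str, dst_path: str) -> None:
    # Hypothetical helper: reads the whole dump at once, so it only suits
    # dumps that fit comfortably in memory.
    with open(src_path, "rb") as src:
        fixed = undo_one_layer(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(fixed)


# Build a doubly encoded sample, then repair it.
broken = "Grundsätze".encode("utf-8").decode("cp1252").encode("utf-8")
print(undo_one_layer(broken).decode("utf-8"))  # Grundsätze
```

The output bytes are meant to be imported as UTF-8, as described above.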

--
brian.


> _______________________________________________
> MediaWiki-l mailing list -- mediawiki-l@lists.wikimedia.org
> To unsubscribe send an email to mediawiki-l-leave@lists.wikimedia.org
>
> https://lists.wikimedia.org/postorius/lists/mediawiki-l.lists.wikimedia.org/
Re: Issue with encoding
Hello Brian,
it was previously on version 1.31; I've upgraded it to the latest version,
hoping that would resolve all the issues. :)

I was given access to phpMyAdmin, and it shows the same article perfectly
fine, without any weird characters, after I changed the encoding from
latin1 to UTF-8.

But on the wiki itself, the characters are still shown incorrectly. Is there
any way to make MediaWiki rebuild the database and show things correctly?

Looking forward to your response. :)

Best regards,
Zoran

Re: Issue with encoding
Not that I am aware of. If there are any, they probably date from MediaWiki
1.5 and don't work anymore.

One possibility is to set $wgLegacyEncoding to ISO-8859-1, and remove utf-8
from the old_flags field (
https://www.mediawiki.org/wiki/Manual:Text_table#old_flags ), so that
MediaWiki thinks the rows are from before MediaWiki 1.5.

You could also try a query like the following (I have not tested this; use
it at your own risk, and have backups):

    UPDATE text
    SET `old_text` = CONVERT(CAST(CAST(`old_text` AS CHAR CHARACTER SET latin1) AS BINARY) USING utf8)
    WHERE old_flags NOT LIKE '%gzip%';

--
Brian

Re: Issue with encoding
Actually, the $wgLegacyEncoding approach probably won't work: it would help
if the text still needed to be converted to UTF-8, but the issue here is
(I think) that it has been converted to UTF-8 twice.


--
Brian

Re: Issue with encoding
Thank you Brian for your suggestions. So I should make a backup of the
database and execute your query, right?

Best regards!

Re: Issue with encoding
That's what I would do. If you want to reduce risk, you could pick a
specific row of the text table and test the query on it to see if it works
before running it on the whole table.

This query assumes that no text anywhere is already validly encoded in
UTF-8. If there is any, it will probably get replaced with a ?
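
To illustrate the caveat above, here is a hedged Python sketch of a per-string repair that leaves text alone when the reverse conversion fails, instead of replacing it. The function name is hypothetical and cp1252 is an assumption; this is not what the SQL query does, only a way to see which strings would be affected.

```python
def repair_if_double_encoded(text: str, codec: str = "cp1252") -> str:
    """Undo one layer of double encoding, or return the text unchanged.

    Correctly encoded non-ASCII text usually fails the round trip, so it is
    kept as-is rather than being turned into '?'.
    """
    try:
        return text.encode(codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in the codec, or not valid UTF-8 afterwards:
        # treat the text as already correct.
        return text


print(repair_if_double_encoded("Ã¤"))        # ä
print(repair_if_double_encoded("ä"))         # ä
print(repair_if_double_encoded("Statik A"))  # Statik A
```

One limit of this check: text that legitimately contains a sequence such as Ã¤ would also be rewritten, so testing on a single row first is still worthwhile.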

--
Brian


Re: Issue with encoding
Your query worked. Before running it, I also backed up the text table by
copying it to text-backup, just in case.

Thank you so much for your time and help! :)

Best regards,
Zoran

Re: Issue with encoding
Hi Zoran,

Why did you sign up for my project on Upwork, then not reply for 3 weeks, then reply saying, "I want to work with you, Steve," and then not reply for weeks again?

This is very untrustworthy! You cannot take someone's code, their data, and their website and just not reply.

This is very unethical as a programmer.

You should send me some reply.

- Steve client from Upwork
Re: Issue with encoding
On Wed, 2023-01-11 at 01:09 +0000, irvingwash--- via MediaWiki-l wrote:
> You should send me some reply.

This is a public community mailing list.
Posting to individuals about potential previous communication that may
(or may not) have taken place in private or third-party venues, in this
case by replying to unrelated year-old list threads, is not welcome.

Please take this elsewhere; it does not belong here. Please check also
https://www.mediawiki.org/wiki/Code_of_Conduct how to phrase things.

Thanks,
andre

--
Andre Klapper (he/him) | Bugwrangler / Developer Advocate
https://blogs.gnome.org/aklapper/