Mailing List Archive

MythWeb translation problem
Hi!

I am working on a Norwegian translation of MythWeb from scratch. The
aim is to have this ready by the end of the week, and include it with
a Norwegian translation update which includes translation updates to
mythfrontend plus translation of all the plugins.

I have translated most of the strings, but I get the following PHP
warnings on some of the pages:
"
Warning at /usr/share/mythtv/mythweb/includes/utils.php, line 195:
!!NoTrans: htmlentities(): Invalid multibyte sequence in argument!!

Warning at /usr/share/mythtv/mythweb/includes/utils.php, line 195:
!!NoTrans: htmlentities(): Invalid multibyte sequence in argument!!

Warning at /usr/share/mythtv/mythweb/includes/utils.php, line 195:
!!NoTrans: htmlentities(): Invalid multibyte sequence in argument!!
"

These warnings appear on all the MythTV recording/scheduling pages,
but not on the Main page, MythWeather, MythMusic, MythVideo or
Settings pages.

I have checked that the "special" Norwegian characters are encoded as
UTF-8 in the .lang file, and googled back and forth on the error
message.
http://insomanic.me.uk/post/191397106/php-htmlspecialchars-htmlentities-invalid
indicates that there is a bug in PHP affecting when this warning message
is displayed, and when I set display_errors to true in php.ini the
display of the warning goes away.

This is on Ubuntu Lucid with PHP 5.

Has anyone else run into this problem when translating MythWeb?

This warning is not triggered when language is set to English or
Danish (Danish uses æøå as well in the translation).

Best regards,

Rune
MythWeb translation problem [ In reply to ]
Hi!

> I am working on a Norwegian translation of MythWeb from scratch. The aim
> is to have this ready by the end of the week, and include it with a
> Norwegian translation update which includes translation updates to
> mythfrontend plus translation of all the plugins.

Thank you!

>
> I have translated most of the strings, but I get the following PHP
> warnings on some of the pages:
> "
> Warning at /usr/share/mythtv/mythweb/includes/utils.php, line 195:
> !!NoTrans: htmlentities(): Invalid multibyte sequence in argument!!

Hmm, sounds like it doesn't like the encoding of some of your UTF-8
strings...

>
> I have checked that the "special" Norwegian characters are encoded as
> UTF-8 in the .lang file, and googled back and forth on the error
> message.
> http://insomanic.me.uk/post/191397106/php-htmlspecialchars-htmlentities-invalid
> indicates that there is a bug in PHP affecting when this warning
> message is displayed, and when I set display_errors to true in php.ini
> the display of the warning goes away.

Do you know which strings are affected in your translation? If you edit
them slightly to remove some or all special characters do you still get the
error?
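
One way to narrow down which strings are affected is to scan the translation data for lines that do not decode as UTF-8. The Python below is only a minimal sketch and is not part of MythWeb; the function name `find_invalid_utf8_lines` and the sample strings are made up for illustration.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_bytes) for each line that is not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: 0x0A never occurs inside a UTF-8 multibyte sequence.
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, raw))
    return bad


# Hypothetical sample: two UTF-8 lines followed by one ISO-8859-1 line.
sample = "Opptak\nSøk\n".encode("utf-8") + "Innstillinger for både\n".encode("iso-8859-1")

for number, raw in find_invalid_utf8_lines(sample):
    print(number, raw)
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function; the line numbers it reports point at the strings worth inspecting first.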

> This is on Ubuntu Lucid with PHP 5.
>
> Has anyone else run into this problem when translating MythWeb?
> This warning is not triggered when language is set to English or Danish
> (Danish uses æøå as well in the translation).

Does it seem to display Danish properly?

(i.e. nothing out of the ordinary appears except for the fact that it's in
another language).

My backend currently has what appears to be a hardware problem which I'm
hoping to fix tonight (by replacing what I think is the defective piece of
hardware). If that fixes it, I could try to help you track down the
problem if you send me your MythWeb translation file.

Good luck and have a nice day!

Nicolas
MythWeb translation problem [ In reply to ]
2010/9/23 Nicolas Riendeau <knight at teksavvy.com>

Thank you Nicolas,

I checked and double checked the .lang file, replaced all special
characters, and verified the file encoding.

But then I checked the Norwegian_NB.cat file, and there was the answer!
file -i Norwegian_NB.cat
Norwegian_NB.cat: text/plain; charset=iso-8859-1

After I updated the file with the correct encoding (and updated some
regex strings) everything was fine.
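
For reference, a re-encoding step of this kind can be sketched in Python as follows. This is an illustration under assumptions, not the procedure actually used here: the helper name `latin1_to_utf8` is hypothetical, and the sketch assumes the input is either entirely valid UTF-8 or entirely ISO-8859-1, so a file that mixes both would need manual attention.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8; return valid UTF-8 input unchanged."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8 (pure ASCII also ends up here)
    except UnicodeDecodeError:
        # Every byte value is valid ISO-8859-1, so this decode cannot fail.
        return raw.decode("iso-8859-1").encode("utf-8")


converted = latin1_to_utf8("blå".encode("iso-8859-1"))
print(converted)
```

Checking for valid UTF-8 first avoids double-encoding a file that has already been converted.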

But it is a strange thing, since this is the output for the Danish translation:
file -i Danish.cat
Danish.cat: text/plain; charset=iso-8859-1
file -i Danish.lang
Danish.lang: text/plain; charset=utf-8

..and the Danish translation works fine.

So possibly/probably/maybe I had a spelling error in the
regex portion of the .cat file which I fixed during the
conversion to UTF-8.

Thank you for the response and the help.

Best regards,

Rune
MythWeb translation problem [ In reply to ]
Hi Rune!

On 9/23/2010 3:56 PM, Rune Evjen wrote:
> I checked and double checked the .lang file, replaced all special
> characters, and verified the file encoding.

OK...

>
> But then I checked the Norwegian_NB.cat file, and there was the answer!
> file -i Norwegian_NB.cat
> Norwegian_NB.cat: text/plain; charset=iso-8859-1

Hmm.. Good catch!

>
> After I updated the file with the correct encoding (and updated some
> regex strings) everything was fine.
>
> But it is a strange thing, since this is the output for the Danish translation:
> file -i Danish.cat
> Danish.cat: text/plain; charset=iso-8859-1

That's very peculiar considering that file clearly has UTF-8 encoded
characters.

(You can see many multibyte characters in that file.)

UTF-8 and ISO-8859-1 do share some character positions and both are
supersets of US-ASCII, but the characters beyond ASCII (such as æøå) are
not encoded in the same way: a single byte in ISO-8859-1, two bytes in
UTF-8.
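
The difference is easy to see by printing the bytes each encoding produces for the three characters in question. The snippet below is just an illustrative Python sketch; the helper name `both_encodings` is made up.

```python
def both_encodings(ch: str):
    """Return the (UTF-8, ISO-8859-1) byte sequences for one character, as hex."""
    return ch.encode("utf-8").hex(), ch.encode("iso-8859-1").hex()


for ch in "æøå":
    print(ch, *both_encodings(ch))

# ASCII characters come out identical in both encodings.
print("a", *both_encodings("a"))

# UTF-8 bytes misread as ISO-8859-1 show up as two unrelated characters.
misread = "æ".encode("utf-8").decode("iso-8859-1")
print(misread)
```

A lone ISO-8859-1 byte such as 0xE6 is not a complete UTF-8 sequence, which is the kind of input that can trigger an "invalid multibyte sequence" complaint.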

The BOM (Byte Order Mark) could help identify one format over the other
but as far as I can remember that's not mandatory...

I wonder how "file" identifies the file character encoding...
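
A plausible heuristic for this kind of detection can be sketched in Python as below. It is an assumption about the general approach, not the actual implementation in "file", which considers more encodings and may only inspect the beginning of a file; the function name `guess_charset` is hypothetical.

```python
import codecs


def guess_charset(raw: bytes) -> str:
    """Guess a charset label using a simple ASCII / UTF-8 / fallback heuristic."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8"  # an optional UTF-8 BOM settles the question
    if all(byte < 0x80 for byte in raw):
        return "us-ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Any byte sequence is decodable as ISO-8859-1, so it is the fallback.
        return "iso-8859-1"


# Mostly UTF-8 data with one stray ISO-8859-1 byte (0xF8) at the end.
mixed = "æ".encode("utf-8") + "ø".encode("iso-8859-1")
print(guess_charset(mixed))
```

Under a heuristic like this, a single invalid sequence is enough to make an otherwise UTF-8 file fall through to the iso-8859-1 label, which would be consistent with a file that visibly contains multibyte characters still being reported as iso-8859-1.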

>
> So possibly/probably/maybe I might have had a spelling error in the
> regex portion of the .cat file which I fixed in the
> convert-to-utf8-process.

I guess so...

>
> Thank you for the response and the help.
>

No problem!

Have a nice day!

Nicolas
MythWeb translation problem [ In reply to ]
2010/9/23 Rune Evjen <rune.evjen at gmail.com>:
> But it is a strange thing, since this is the output for the Danish translation:
>  file -i Danish.cat
>  Danish.cat: text/plain; charset=iso-8859-1
>  file -i Danish.lang
>  Danish.lang: text/plain; charset=utf-8

Thanks for spotting this, I've cleaned up the Danish encoding in r26492 :)

Best regards
Kenni
MythWeb translation problem [ In reply to ]
Hi!

> 2010/9/23 Rune Evjen <rune.evjen at gmail.com>:
> > But it is a strange thing, since this is the output for the Danish
> > translation:
> >  file -i Danish.cat
> >  Danish.cat: text/plain; charset=iso-8859-1
> >  file -i Danish.lang
> >  Danish.lang: text/plain; charset=utf-8
>
> Thanks for spotting this, I've cleaned up the Danish encoding in r26492 :)
>
> Best regards
> Kenni

Hmm, that's interesting... I've just learned that the ssh client program I
use to remotely access my backend actually remaps some characters before
displaying them.

I could clearly see multi-byte characters where you applied those fixes...

I've just installed a vi-like hex editor called bvi to make sure I don't
get fooled by my ssh client next time...
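
Where no hex editor is at hand, the raw bytes can also be inspected with a few lines of Python, which sidesteps whatever the terminal or ssh client does to the display. This is a minimal sketch; the function name `non_ascii_runs` and the sample bytes are made up for illustration.

```python
def non_ascii_runs(raw: bytes):
    """Return (offset, hex) for each consecutive run of bytes >= 0x80."""
    runs = []
    start = None
    for offset, byte in enumerate(raw):
        if byte >= 0x80:
            if start is None:
                start = offset  # a new run of non-ASCII bytes begins here
        elif start is not None:
            runs.append((start, raw[start:offset].hex()))
            start = None
    if start is not None:
        runs.append((start, raw[start:].hex()))  # run reaching the end of the data
    return runs


# Hypothetical sample: "ø" once as UTF-8 (c3 b8) and once as ISO-8859-1 (f8).
sample_bytes = b"S\xc3\xb8k og s\xf8k"
for offset, hexed in non_ascii_runs(sample_bytes):
    print(offset, hexed)
```

Two-byte runs starting with c3 suggest UTF-8 for characters like æøå, while isolated single bytes such as e6, f8 or e5 suggest ISO-8859-1.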

Have a nice day!

Nick