Mailing List Archive

Re: [XML-SIG] Python 1.6a2 Unicode experiences?
Andy Robinson wrote:
>
> - you can work with old fashioned strings, which are understood
> by everyone to be arrays of bytes, and there is no magic
> conversion going on. The bytes in literal strings in your script file
> are the bytes that end up in the program.

Who is "everyone"? Are you saying that CP4E hordes are going to
understand that the syntax "abcde" is constructing a *byte array*? It
seems like you think that Python users are going to be more
sophisticated in their understanding of these issues than Java
programmers. In most other things, Python is simpler.
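
To make the byte-array versus text distinction concrete, here is a minimal sketch in present-day Python 3 (not the 1.6a2 alpha under discussion, whose string API differed). The helper `char_and_byte_counts` is a hypothetical name introduced for illustration only. It shows that character count and byte count agree for ASCII and can diverge once other characters appear.

```python
def char_and_byte_counts(text, encoding):
    """Return (number of characters, number of bytes) for text in an encoding."""
    return len(text), len(text.encode(encoding))


# Plain ASCII: one byte per character.
ascii_counts = char_and_byte_counts("abcde", "ascii")

# "cafe" with an acute accent on the e: four characters either way,
# but the byte count depends on the encoding chosen.
utf8_counts = char_and_byte_counts("caf\u00e9", "utf-8")
latin1_counts = char_and_byte_counts("caf\u00e9", "latin-1")
```

The point of the sketch is only that "a string is an array of bytes" stops being a safe mental model as soon as the encoding is not fixed.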

> ...
>
> I'm also convinced that the majority of Python scripts won't need
> to work in Unicode.

Anything working with XML will need to be Unicode. Anything working with
the Win32 API (especially COM) will want to do Unicode. Over time the
entire Web infrastructure will move to Unicode. Anything written in
JPython pretty much MUST use Unicode (doesn't it?).

> Even working with exotic languages, there is always a native
> 8-bit encoding.

Unicode has many encodings: Shift-JIS, Big-5, EBCDIC ... You can use
8-bit encodings of Unicode if you want.
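
As a rough illustration in present-day Python 3, the same Unicode text type can be written out through several of the legacy codecs named above. This is a sketch under assumptions: the codec names `shift_jis`, `big5` and `cp500` (one EBCDIC code page) are the ones shipped with CPython, and `SAMPLES` and `encode_all` are hypothetical names. Each of these codecs covers only the characters its legacy character set contains, so the sample text was chosen to fit.

```python
# (text, codec name) pairs; the text is written with escapes so the
# example does not depend on the encoding of this message.
SAMPLES = [
    ("\u65e5\u672c\u8a9e", "shift_jis"),  # Japanese sample text
    ("\u4e2d\u6587", "big5"),             # Chinese sample text
    ("Python", "cp500"),                  # an EBCDIC code page
]


def encode_all(samples):
    """Map each codec name to the bytes produced for its sample text."""
    return {encoding: text.encode(encoding) for text, encoding in samples}


encoded = encode_all(SAMPLES)
```

Decoding each result with the same codec gives back the original text for these samples, which is the sense in which an 8-bit (or multi-byte) form and the Unicode form can coexist.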

--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
It's difficult to extract sense from strings, but they're the only
communication coin we can count on.
- http://www.cs.yale.edu/~perlis-alan/quotes.html
Re: Re: [XML-SIG] Python 1.6a2 Unicode experiences?
Paul Prescod [paul@prescod.net] wrote:
> > I'm also convinced that the majority of Python scripts won't need
> > to work in Unicode.
>
> Anything working with XML will need to be Unicode. Anything working with
> the Win32 API (especially COM) will want to do Unicode. Over time the
> entire Web infrastructure will move to Unicode. Anything written in
> JPython pretty much MUST use Unicode (doesn't it?).

I disagree with this. Unicode has been around a very long time, and it's not
been adopted by a lot of people for a LOT of very valid reasons.

> > Even working with exotic languages, there is always a native
> > 8-bit encoding.
>
> Unicode has many encodings: Shift-JIS, Big-5, EBCDIC ... You can use
> 8-bit encodings of Unicode if you want.

Um, if you go:

JIS -> Unicode -> JIS

you don't get the same thing out that you put in (at least this is
what I've been told by a lot of Japanese developers), and therefore
it's not terribly popular because of the nature of the Japanese (and
Chinese) language.
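
A claim like this can at least be tested against a particular converter. The following is a minimal sketch in present-day Python 3, assuming the `shift_jis` codec shipped with CPython; `round_trips` is a hypothetical helper name. It checks one sample only, so a passing result says nothing about every character or every conversion table.

```python
def round_trips(data, encoding):
    """Return True if bytes -> Unicode -> bytes reproduces data exactly.

    Data that cannot be decoded or re-encoded counts as a failed round trip.
    """
    try:
        return data.decode(encoding).encode(encoding) == data
    except UnicodeError:
        return False


# Three common kanji, encoded as Shift-JIS bytes for the check.
sample = "\u65e5\u672c\u8a9e".encode("shift_jis")
sample_ok = round_trips(sample, "shift_jis")
```

Whether a given legacy file survives the trip depends on the mapping tables in the converter used, which is the point in dispute here.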

My experience with Unicode is that a lot of Western people think it's
the answer to every problem asked, while most Asian language people
disagree vehemently. This says the problem isn't solved yet, even if
people wish to deny it.

Chris
--
| Christopher Petrilli
| petrilli@amber.org
Re: Re: [XML-SIG] Python 1.6a2 Unicode experiences?
[Note: This discussion should all move to i18n-sig... CCing there]

Christopher Petrilli wrote:
>
> Paul Prescod [paul@prescod.net] wrote:
> > > Even working with exotic languages, there is always a native
> > > 8-bit encoding.
> >
> > Unicode has many encodings: Shift-JIS, Big-5, EBCDIC ... You can use
> > 8-bit encodings of Unicode if you want.
>
> Um, if you go:
>
> JIS -> Unicode -> JIS
>
> you don't get the same thing out that you put in (at least this is
> what I've been told by a lot of Japanese developers), and therefore
> it's not terribly popular because of the nature of the Japanese (and
> Chinese) language.
>
> My experience with Unicode is that a lot of Western people think it's
> the answer to every problem asked, while most Asian language people
> disagree vehemently. This says the problem isn't solved yet, even if
> people wish to deny it.

Isn't this a problem of the translation rather than Unicode
itself (Andy mentioned several times that you can use the private
BMP areas to implement 1-1 round-trips) ?
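
The private-area idea can be sketched with a toy example in present-day Python 3. Everything here is an assumption made for illustration: `TOY_TABLE` is an invented two-entry single-byte character set, not a real one, and `decode_with_pua` / `encode_with_pua` are hypothetical helpers. Bytes with no Unicode mapping are parked in the BMP Private Use Area (U+E000 to U+F8FF) so that they can be recovered exactly on the way back.

```python
PUA_BASE = 0xE000  # first code point of the BMP Private Use Area

# Invented toy character set: only two bytes have a real Unicode mapping.
TOY_TABLE = {0x41: "A", 0x42: "B"}


def decode_with_pua(data, table):
    """Decode single-byte data; unmapped bytes become PUA characters."""
    return "".join(table.get(b, chr(PUA_BASE + b)) for b in data)


def encode_with_pua(text, table):
    """Inverse of decode_with_pua for a table with distinct values."""
    reverse = {ch: b for b, ch in table.items()}
    out = bytearray()
    for ch in text:
        if ch in reverse:
            out.append(reverse[ch])
        elif PUA_BASE <= ord(ch) < PUA_BASE + 256:
            # A parked byte: undo the offset to get the original value.
            out.append(ord(ch) - PUA_BASE)
        else:
            raise ValueError("character %r has no mapping" % ch)
    return bytes(out)


legacy = bytes([0x41, 0x80, 0x42])  # 0x80 has no entry in TOY_TABLE
as_text = decode_with_pua(legacy, TOY_TABLE)
restored = encode_with_pua(as_text, TOY_TABLE)
```

The round trip is exact, but the parked characters only mean something to software that knows this private convention, which may be why such translations are not common.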

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Re: [XML-SIG] Python 1.6a2 Unicode experiences?
> [Note: This discussion should all move to i18n-sig... CCing there]
>
> Christopher Petrilli wrote:
> > you don't get the same thing out that you put in (at least this is
> > what I've been told by a lot of Japanese developers), and therefore
> > it's not terribly popular because of the nature of the Japanese (and
> > Chinese) language.
> >
> > My experience with Unicode is that a lot of Western people think it's
> > the answer to every problem asked, while most Asian language people
> > disagree vehemently. This says the problem isn't solved yet, even if
> > people wish to deny it.

[Marc-Andre Lemburg]
> Isn't this a problem of the translation rather than Unicode
> itself (Andy mentioned several times that you can use the private
> BMP areas to implement 1-1 round-trips) ?

Maybe, but apparently such high-quality translations are rare (note
that Andy said "can").

Anyway, a word of caution here. Years ago I attended a number of IETF
meetings on internationalization, in a time when Unicode wasn't as
accepted as it is now. The one thing I took away from those meetings
was that this is a *highly* emotional and controversial issue.

As the Python community, I feel we have no need to discuss "why
Unicode." Therein lies madness, controversy, and no progress. We
know there's a clear demand for Unicode, and we've committed to
support it. The question now at hand is "how Unicode." Let's please
focus on that, e.g. in the other thread ("Unicode debate") in i18n-sig
and python-dev.

--Guido van Rossum (home page: http://www.python.org/~guido/)