"nom <d@d.dd>" wrote:
> i'm from france
hmm. last time I checked, france was ".fr",
not ".dd"...
> then i use accent : i try to convert a string :
> 'Soci\351t\351' to 'Société'
> how can do it?
print it?
>>> print 'Soci\351t\351'
Société
...
Python is mostly agnostic when it comes to
character sets -- strings just contain 8-bit
characters, and it's up to the programmer
to make sure they're interpreted in the right
way on input or output. in your case,
'Soci\351t\351' is an ISO Latin 1 string (also
known as ISO 8859-1). so if your environment
uses ISO Latin 1, it just works. on the other
hand, if your environment were to use, say,
IBM's old PC encoding (like in the MS-DOS
window under Windows), it would come out
as:
>>> print 'Soci\351t\351'
SociÚtÚ
and in a UTF-8 environment, it's an
illegal string:
>>> print unicode('Soci\351t\351')
Traceback (innermost last):
  File "<stdin>", line 1, in ?
ValueError: invalid UTF-8 code
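
for what it's worth, here's a minimal sketch of the same three
cases side by side. it assumes a Python 3 interpreter (not the
one used in the sessions above), where byte strings and text are
separate types and the interpretation step is an explicit
decode(). 'cp850' is my guess at a codec for "IBM's old PC
encoding", picked because it maps byte 0xE9 to the same character
as the MS-DOS output shown earlier. the variable names are made
up for illustration.

```python
# the raw bytes behind 'Soci\351t\351' (octal \351 is byte 0xE9)
raw = b'Soci\351t\351'

# interpreted as ISO Latin 1 (ISO 8859-1), 0xE9 is e-acute
latin1 = raw.decode('iso-8859-1')

# interpreted as code page 850 (assumed stand-in for the old
# IBM PC encoding), the same byte is U-acute
dos = raw.decode('cp850')

# interpreted as UTF-8, 0xE9 starts a multi-byte sequence that
# the following 't' cannot continue, so decoding fails
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(latin1)
print(dos)
print(utf8_ok)
```

the bytes never change; only the codec used to read them does.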
...
so I guess the answer to your question is
"depends on what you're doing..."
but before you try to explain that, please
take a look at Jukka Korpela's character
code tutorial, available from:
http://www.hut.fi/~jkorpela/chars.html

"This document in itself does not
contain solutions to practical problems
with character codes; rather, it gives
background information needed for
understanding what solutions there
might be, what the different solutions
do - and what's really the problem in
the first place."
</F>