Mailing List Archive

[ python-Bugs-1770551 ] words able to decode but unable to encode in GB18030
Bugs item #1770551, was opened at 2007-08-09 01:34
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1770551&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Z-flagship (zaex)
Assigned to: M.-A. Lemburg (lemburg)
Summary: words able to decode but unable to encode in GB18030

Initial Comment:
The following Chinese characters can be read from a file encoded in GB18030, but cannot be encoded back to GB18030.

Details:
I used codecs.open(r'file name', encoding='GB18030') to read the characters from a file, then tried to encode them one at a time with word.encode('GB18030'). Each attempt raised an exception with the message 'illegal multibyte sequence'.

The attached file contains the same list.

list:
䎬䎱䅟䌷䦟䦷䲠㧏㭎㘚㘎㱮䴔䴖䴗䦆㧟䙡䙌䴕䁖䎬䴙䥽䝼䞍䓖䲡䥇䦂䦅䴓㩳㧐㳠䲢䴘㖞䜣䥺䶮䜩䥺䲟䲣䦛䦶㑳㑇㥮㤘䏝䦃
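
The check described above can be sketched roughly as follows. This is only an illustration, not the reporter's actual script: it assumes a Python 3 interpreter (where str is already Unicode, unlike the Python 2.5 setup in the report), it takes the characters from a string instead of a file, and the helper name find_unencodable is hypothetical.

```python
def find_unencodable(text, encoding="gb18030"):
    """Return the characters in `text` that fail to encode with `encoding`."""
    failures = []
    for ch in text:
        try:
            # Encode one character at a time, as the report describes.
            ch.encode(encoding)
        except UnicodeEncodeError:
            failures.append(ch)
    return failures


# A few characters copied from the list in the report.
sample = "䎬䎱䅟䌷䦟"
failures = find_unencodable(sample)
print(len(failures), "of", len(sample), "characters failed to encode")
```

A non-zero count would point at the characters the encoder rejects even though they are valid Unicode text.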

----------------------------------------------------------------------

_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com
[ python-Bugs-1770551 ] words able to decode but unable to encode in GB18030 [ In reply to ]
Bugs item #1770551, was opened at 2007-08-09 01:34
Message generated for change (Comment added) made by zaex

>Comment By: Z-flagship (zaex)
Date: 2007-08-09 01:37

Message:
Logged In: YES
user_id=1863611
Originator: YES

I am using Python 2.5 on Windows XP Professional SP2 (version 2002).

[ python-Bugs-1770551 ] words able to decode but unable to encode in GB18030 [ In reply to ]
Bugs item #1770551, was opened at 2007-08-08 18:34
Message generated for change (Comment added) made by nnorwitz
>Assigned to: Hye-Shik Chang (perky)

>Comment By: Neal Norwitz (nnorwitz)
Date: 2007-08-09 20:35

Message:
Logged In: YES
user_id=33168
Originator: NO

This seems like a CJK codec problem. Hye-Shik, could you take a look?

[ python-Bugs-1770551 ] words able to decode but unable to encode in GB18030 [ In reply to ]
Bugs item #1770551, was opened at 2007-08-09 10:34
Message generated for change (Comment added) made by perky
>Status: Closed
>Resolution: Duplicate

>Comment By: Hye-Shik Chang (perky)
Date: 2007-08-13 00:18

Message:
Logged In: YES
user_id=55188
Originator: NO

The problem was fixed about a week ago (r56727-8).
The fix will be included in the forthcoming Python releases. Thank you for
reporting!

