New submission from paul rubin <phr@users.sourceforge.net>:
For object serialization and some other purposes, Java encodes Unicode
strings with a modified version of UTF-8:
http://en.wikipedia.org/wiki/UTF-8#Java
http://java.sun.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8
It is used in Lucene index files among other places.
It would be useful if Python had a codec for this, maybe called "UTF-8J"
or something like that.
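
To make the request concrete, here is a rough sketch of what such a codec
would have to do, going by the DataInput documentation linked above: U+0000
is written as the two bytes C0 80, and characters above U+FFFF are written
as a surrogate pair with each surrogate encoded as three bytes. The function
names are made up for illustration, and the code assumes Python 3.4 or later
(for the "surrogatepass" error handler), not the 2.5 this report is filed
against. It is a sketch, not a proposed patch.

```python
def encode_modified_utf8(text):
    """Encode a str as Java-style modified UTF-8 (hypothetical helper)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # NUL is written as an overlong two-byte sequence.
            out += b"\xc0\x80"
        elif cp < 0x10000:
            # BMP characters (including lone surrogates) use ordinary
            # one- to three-byte UTF-8.
            out += ch.encode("utf-8", "surrogatepass")
        else:
            # Supplementary characters are split into a UTF-16 surrogate
            # pair, and each surrogate is encoded as three bytes.
            cp -= 0x10000
            high = chr(0xD800 + (cp >> 10))
            low = chr(0xDC00 + (cp & 0x3FF))
            out += high.encode("utf-8", "surrogatepass")
            out += low.encode("utf-8", "surrogatepass")
    return bytes(out)


def decode_modified_utf8(data):
    """Decode Java-style modified UTF-8 into a str (hypothetical helper)."""
    # The byte C0 is only valid here as the lead byte of the NUL encoding,
    # so a plain replace is enough for well-formed input.
    raw = bytes(data).replace(b"\xc0\x80", b"\x00")
    # Decode, letting three-byte surrogates through as lone code points.
    text = raw.decode("utf-8", "surrogatepass")
    # Round-trip through UTF-16 to join surrogate pairs into single
    # supplementary characters.
    return text.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass")
```

For example, encode_modified_utf8("A\x00") gives b"A\xc0\x80", so the
encoded form never contains a zero byte. The decoder does no validation of
malformed input beyond what the built-in UTF-8 codec already does.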
----------
components: Library (Lib)
messages: 66843
nosy: phr
severity: normal
status: open
title: add codec for java modified utf-8
versions: Python 2.5
__________________________________
Tracker <report@bugs.python.org>
<http://bugs.python.org/issue2857>
__________________________________