On 29.08.2014 02:41, Stephen J. Turnbull wrote:
> In the process of booking up for my other post in this thread, I
> noticed the 'surrogatepass' handler.
>
> Is there a real use case for the 'surrogatepass' error handler? It
> seems like a horrible break in the abstraction. IMHO, if there's a
> need, the application should handle this. Python shouldn't provide
> it on encoding as the resulting streams are not Unicode conformant,
> nor on decoding UTF-16, as conversion of surrogate pairs is a
> requirement of all Unicode versions since about 1995.
This error handler allows applications to reactivate the Python 2
style behavior of the UTF codecs in Python 3. In Python 2, those
codecs accepted lone surrogates on input.
Since Python allows working with lone surrogates in Unicode (they
are valid code points) and we're using UTF-8 for marshal, we needed
a way to make sure that Python 3 also optionally supports working
with lone surrogates in such UTF-8 streams (nowadays called CESU-8:
http://en.wikipedia.org/wiki/CESU-8).
See
http://bugs.python.org/issue3672
http://bugs.python.org/issue12892
for discussions.
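
To make the behavior concrete, here is a minimal sketch of what the
handler does with the UTF-8 codec in Python 3. The variable names are
only for illustration, and the sample code points are arbitrary:

```python
# A lone high surrogate: a valid code point in a Python str, but not
# encodable by the strict UTF-8 codec.
lone = "\ud800"

try:
    lone.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False  # the default 'strict' handler rejects it

# With 'surrogatepass' the surrogate is written as a 3-byte sequence.
raw = lone.encode("utf-8", "surrogatepass")

# The same handler is needed to read those bytes back.
roundtrip = raw.decode("utf-8", "surrogatepass")

try:
    raw.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False  # strict decoding rejects the byte sequence

# A surrogate pair kept as two separate code points is encoded as
# 2 x 3 bytes instead of one 4-byte sequence.
pair_bytes = "\ud83d\ude00".encode("utf-8", "surrogatepass")
```

The resulting bytes are not well-formed UTF-8, so this should only be
used where both ends of the stream agree on the convention.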
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Aug 29 2014)