Add a function like chr that always returns a byte string

Description

Python 2 chr returns an instance of str (byte string). Python 3 chr returns an instance of str (text string). There's no obvious polyglot expression of the Python 2 version of the behavior, so we should have an API that takes care of the differences between Python 2 and Python 3 for this behavior.
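To make the problem concrete, here is a small sketch of why the obvious spellings do not behave the same on both versions. The variable names are made up for illustration; the behavior noted in the comments is that of the stock chr and bytes builtins.

```python
import sys

PY3 = sys.version_info[0] >= 3

# chr() returns the native str type on both versions: a byte string on
# Python 2, a text string on Python 3.
native = chr(65)

# bytes([65]) looks like a natural replacement, but only works on Python 3.
# On Python 2, bytes is an alias for str, so this yields the text "[65]".
from_list = bytes([65])

if PY3:
    assert isinstance(native, str) and not isinstance(native, bytes)
    assert from_list == b"A"
    # bytes(65) is another trap: it builds 65 zero bytes, not b"A".
    assert bytes(65) == b"\x00" * 65
else:
    assert isinstance(native, bytes)
    assert from_list == "[65]"
```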

At least twisted.python.randbytes and twisted.names.dns want to use this.

Change History (18)

Here's a patch that creates such a chr function. It tries to be an exact duplicate of 2x's chr, right down to the error messages. I stuck it in its own file, chr_.py, under twisted/python. This also means there's a new file of tests, test_chr_.py, under twisted/python/test. I could also see putting it in _util_py3.py, until such time as _util_py_3 is merged back into util.

unichr is a closely related function. 3x's chr is a wide 2x build's unichr. The narrow 2x build's unichr has no equivalent. It would make sense to smooth this over as well, though in a separate ticket. unichr is only used in one place in Twisted, twisted.words.protocols.jabber.xmpp_stringprep, so it's not very important.

ord also changed a bit; it used to behave differently between wide and narrow builds and that distinction is gone now. But that doesn't really have anything to do with Twisted, so ord is fine the way it is.

Oh, I see it now. So personally I would test only 126 and 255 for positive testcases, and 125 and 254 for negative scenarios (and all other negative scenarios). Can you create one function and move " if _PY3: " inside the function (and also move doc inside)?

So personally I would test only 126 and 255 for positive testcases, and 125 and 254 for negative scenarios (and all other negative scenarios).

I don't know what you mean here. chr_ should return a bytestring containing the corresponding ASCII (okay latin1, I'm being sloppy) character for any integer in [0, 255] inclusive. I don't understand why we shouldn't be testing lower than 126, and I don't understand what you mean by testing for negative scenarios for 125 and 254.

chr_(125) should, and does, yield '}' in 2x and b'}' in 3x. And note that these are the same thing. Well, almost the same thing. Index into a bytestring in 2x and you get a 1 character byte string. Index into a bytestring in 3x and you get an integer. But smoothing THAT over is more than a measly chr() replacement can do.
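The indexing difference mentioned above can be seen in a few lines. This is only an illustration of the builtin bytes behavior, not part of the patch; the variable names are arbitrary.

```python
data = b"}"

# Indexing: a length-1 byte string on Python 2, an integer on Python 3.
first = data[0]

# Slicing gives a length-1 byte string on both versions.
first_slice = data[0:1]
assert first_slice == b"}"

# Normalizing an indexed element back to an integer on either version.
code = first if isinstance(first, int) else ord(first)
assert code == 125
```

Slicing is the usual way to sidestep the difference when a one-byte string is wanted.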

Certainly there should be a negative result (a ValueError) for 256 and -1, and those checks are already in place. (Lines 565 and 566, _negativeIntegers and integersOver255.)

Incidentally, lower pane ASCII ends at the 128th character, that is, the character represented by the binary number 127. So the boundary is 127-128. Not 125-126.

Can you create one function and move " if _PY3: " inside the function (and also move doc inside)?

I've been told in the past to perform the 3x check outside of functions (http://twistedmatrix.com/trac/ticket/5897#comment:4), so I did the same here. In addition, this style is consistent with the definitions of iterbytes and a few other functions in twisted.python.compat, just above the definition of chr_.

Personally I agree, it looks just a little bit odd.
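For reference, the module-level style being discussed looks roughly like the sketch below. The names chr_ and _PY3 come from this ticket, but the bodies are an assumption for illustration and not the attached patch; in particular, the way _PY3 is computed and the wording of the error message (modelled on Python 2's) are guesses.

```python
import sys

# Assumption: a simple version check standing in for the real _PY3 flag.
_PY3 = sys.version_info[0] >= 3

if _PY3:
    def chr_(i):
        """
        Return a byte string of length one holding the byte with value C{i}.
        """
        if not 0 <= i <= 255:
            # Worded after the message Python 2's chr uses.
            raise ValueError("chr() arg not in range(256)")
        return bytes([i])
else:
    # On Python 2 the builtin already returns a byte string.
    chr_ = chr
```

The version check runs once at import time, so calling chr_ does not pay for it on every call.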

Again, I appreciate the feedback, especially since it was so surprisingly prompt :)!

Sorry for creating a mess with the range ends. So if the range is 0-255, I would check just the positive scenarios for the beginning of the range, maybe the middle of the range, and the end of the range (255), to make the tests more compact and avoid checking the same case multiple times. But I guess it's just a minor issue, so we can leave it as it is now.
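A compact version of what I have in mind might look like the sketch below. The chr_ here is only a stand-in so the example runs on its own, the test class and method names are made up, and it uses the standard library's unittest rather than whatever base class the real tests use.

```python
import unittest


def chr_(i):
    # Stand-in for the function under test, not the patch itself.
    # bytearray rejects values outside range(256) with ValueError.
    return bytes(bytearray([i]))


class ChrBoundaryTests(unittest.TestCase):
    def test_insideRange(self):
        # Both ends of the range plus the 127/128 ASCII boundary.
        for i, expected in [(0, b"\x00"), (127, b"\x7f"),
                            (128, b"\x80"), (255, b"\xff")]:
            self.assertEqual(chr_(i), expected)

    def test_outsideRange(self):
        # The nearest values on either side of the valid range.
        for i in (-1, 256):
            self.assertRaises(ValueError, chr_, i)
```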

Hey :) Thanks for your patch, looks pretty good to me so far. Some questions I had reading it:

What's the purpose of the step size of 1 in the range for _allASCIICharacters? Does it do anything special vs just omitting it that I'm forgetting?
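As far as I can tell, 1 is the default step for range, so spelling it out should change nothing. A quick check, with bounds made up for illustration (the real _allASCIICharacters may use different ones):

```python
# Hypothetical bounds, chosen only to illustrate the default step.
with_step = list(range(0, 128, 1))
without_step = list(range(0, 128))
shortest = list(range(128))

assert with_step == without_step == shortest
```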

Why is this chr_ instead of chr? At first I thought it might be to avoid stomping over existing names, but the rest of the compat module appears to have no qualms with that. Secondly, I thought Py3 might've added a prohibition on rebinding certain names, like it does for None, True and False, but that's not the case, at least not in my version of 3.3 or 3.4a0.
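This is roughly how I checked the rebinding question. The helper name assignment_compiles is made up for this example, and the lambda is just a throwaway replacement, not a proposal for the real implementation.

```python
import sys


def assignment_compiles(source):
    """Report whether the compiler accepts the given source."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Rebinding a builtin name such as chr is ordinary assignment.
assert assignment_compiles("chr = lambda i: bytes(bytearray([i]))")

# Assigning to None is rejected at compile time.
assert not assignment_compiles("None = 1")

# True and False are only protected this way on Python 3.
if sys.version_info[0] >= 3:
    assert not assignment_compiles("True = 1")

# The rebound name shadows the builtin within that namespace only.
namespace = {}
exec("chr = lambda i: bytes(bytearray([i]))", namespace)
assert namespace["chr"](125) == b"}"
```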