Present Perfect

2009-11-28 5:02 pm

I usually tend to think of Python as the discerning gentleman’s programming language: well-behaved, well-documented, and written by people who take care of their code. I like the batteries-included approach and assume that the battery code in the standard library is well-written. “import this” is a vision statement directly included in the language – it’s hard to get more stylish than that.

I got an eye-opener this weekend, however. I was still on my quest to get desktopcouch and ubuntuone working on Fedora. While wrestling with this bug and doing things that I usually consider a hanging offense (changing /usr-installed code by adding prints to figure out where the craziness was coming from), I finally drilled down to the reason for the exception. It all boiled down to a single line of code in httplib.py:
    def __init__(self, host, port=None, key_file=None, cert_file=None,
                 strict=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):

where socket.py contains
    _GLOBAL_DEFAULT_TIMEOUT = object()

So, in a nutshell, httplib2.py tracebacks because of this new object, which isn’t a valid argument to sock.settimeout().
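A minimal sketch of that failure, assuming a current Python 3 interpreter rather than the 2.6 setup described here: a bare object() is neither a number nor None, so settimeout() refuses it. The exact error message differs between versions, so only the exception type is checked, and the name `marker` is made up for illustration.

```python
import socket

marker = object()  # hypothetical stand-in for the sentinel that leaked through

sock = socket.socket()  # created but never connected
try:
    sock.settimeout(marker)
except TypeError as exc:
    rejected = True
    print("settimeout() rejected the marker:", exc)
else:
    rejected = False
finally:
    sock.close()
```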

Now, I’m pretty sure I’m running into this problem because I’m doing “bad” things to some of the stdlib, pulling in bits I need to make ubuntuone (coded on a 2.6.3 python where my Fedora 11 comes with 2.6) work.

But pulling the cover off like this did point out this one object that:

seems to be intended to be private, but it gets referenced from other stdlib modules

comes with no documentation at all

comes with not a single comment explaining *why* it’s there, or *why* it’s ok to just create a completely empty and useless “object” that you can’t even trace the origin of (I had to override __setattr__ on some class to figure out what the anonymous object was, and where it was being set from, to find it)
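The __setattr__ trick mentioned in the last point can be sketched roughly as follows. This is not the code actually used during that debugging session; `TracedConnection` and `configure` are hypothetical names, and the idea is simply to record which function performed each attribute assignment.

```python
import traceback


class TracedConnection:
    """Hypothetical stand-in for the class whose attribute was being set."""

    log = []  # one (attribute, value, function, line number) entry per assignment

    def __setattr__(self, name, value):
        # limit=2 keeps the two innermost frames: [caller, this method].
        caller = traceback.extract_stack(limit=2)[0]
        TracedConnection.log.append((name, value, caller.name, caller.lineno))
        object.__setattr__(self, name, value)


def configure(conn):
    # Hypothetical code path that stores the anonymous object.
    conn.timeout = object()


conn = TracedConnection()
configure(conn)
attribute, value, function, lineno = TracedConnection.log[-1]
print("%s set in %s() at line %d" % (attribute, function, lineno))
```

The log shows where the value came from even though the object itself carries no name.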

Maybe I’m old-fashioned, but this leaves me disappointed. This one line breaks the beauty, explicitness, and readability that are included in the Zen of Python.

Creating an empty object is a not-so-uncommon way of getting hold of a unique sentinel value.

If a module has “foo = object()”, the rest of the code can let “foo” be a default value for function arguments. Inside the function one can check if “arg is foo” to see if “arg” was left to its default value (foo). This is necessary if you want to differentiate between people passing no argument and passing the normal value for “no argument” namely None.
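A small sketch of that pattern, with made-up names (`_MISSING` and `describe_timeout` are not from any library):

```python
_MISSING = object()  # hypothetical sentinel; no other value is identical to it


def describe_timeout(timeout=_MISSING):
    """Report how the caller specified the timeout."""
    if timeout is _MISSING:
        # The argument was left out entirely.
        return "use the global default"
    if timeout is None:
        # None was passed on purpose and keeps its usual meaning.
        return "block forever"
    return "wait %s seconds" % timeout


print(describe_timeout())      # use the global default
print(describe_timeout(None))  # block forever
print(describe_timeout(2.5))   # wait 2.5 seconds
```

The identity check with “is” is what makes this safe: no value a caller passes by accident can be the same object as the sentinel.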

I use Python at work quite a bit. It’s a very good language in many ways. But as far as library handling goes the language is rather the ugly duckling. There are just too many issues around library versioning, installing and general handling. Far too often an app will fail to work because libraries aren’t installed where the app can find them or there’s some incompatibility between minor Python or library versions, or Python fails to adapt gracefully to being installed in different ways by different distributions.

It doesn’t need to be this way, and it can be done better. Ruby and Perl are both much better behaved about this and fail far less. I don’t know if it’s a deficiency in the Python system’s own ability to adapt properly to changes, or if it’s just not giving library and app writers the tools to effortlessly do the right thing. But this is a visible and unnecessary blemish on an otherwise quite beautiful language.

Creating just an object like that and then using it as a default parameter is a common and perfectly acceptable way to track if a parameter was passed or not, when None is an acceptable value to pass in.

It’s used in this code:
    if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
        sock.settimeout(timeout)
And since None is an acceptable parameter to settimeout, that means None is an acceptable parameter to create_connection. Therefore the default needs to be something else. So a marker object is created and used.
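To see the difference between “no timeout given” and “timeout=None” in practice, here is a rough sketch assuming Python 3. `make_socket` is a hypothetical helper that copies the check above but never connects anywhere, and the 5.0 second default is an arbitrary value picked for the demonstration.

```python
import socket


def make_socket(timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    """Hypothetical helper: create a socket without connecting it."""
    sock = socket.socket()
    if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
        sock.settimeout(timeout)  # None here means "block forever"
    return sock


previous = socket.getdefaulttimeout()
socket.setdefaulttimeout(5.0)  # arbitrary global default for the demo
try:
    inherited = make_socket()     # argument omitted
    blocking = make_socket(None)  # None passed explicitly
    results = (inherited.gettimeout(), blocking.gettimeout())
    inherited.close()
    blocking.close()
finally:
    socket.setdefaulttimeout(previous)  # restore the interpreter-wide setting

print(results)  # (5.0, None)
```

With None as the default, the two calls could not be told apart and the global default would be silently overridden.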

Why is it referenced from other parts? Well, because they also want a timeout parameter. So:

@Lennart: I understand the goal, but the problem is with the style. First, creating a general object() makes it impossible to trace what object it is, where it is created, and what purpose it serves. Second, if it starts with an underscore it’s clearly intended to be private. So there is a mismatch in purpose of intent on the author’s part: a ‘constant’ to be used by various modules should not be started with an underscore and left without documentation. It’s quite simply laziness on the author’s part, almost suggesting that he put in an underscore because he couldn’t be bothered to write a comment or docs and get away with it since he can later handwave and say ‘this was private’.

Consenting adults works fine when people accept their responsibility. I doubt the author did in this case.


Using a marker object is not that uncommon, although it can be done better. In Storm, we use an object called “Undef” for this purpose that has a repr() of “Undef”. That would seem to cover the debugging aspects of your complaint.
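A sketch of what such a self-describing marker could look like. This is not Storm’s actual implementation, only an illustration of the idea; the class name `_UndefType` and the function `set_name` are made up.

```python
class _UndefType:
    """Hypothetical marker type whose single instance prints as 'Undef'."""

    __slots__ = ()  # the marker carries no state of its own

    def __repr__(self):
        return "Undef"


Undef = _UndefType()


def set_name(name=Undef):
    # The identity check works exactly as with a bare object().
    if name is Undef:
        return "name left unchanged"
    return "name set to %r" % (name,)


print(repr(Undef))     # Undef
print(set_name())      # name left unchanged
print(set_name(None))  # name set to None
```

When the marker shows up in a traceback or a debugger it now prints as Undef rather than as an anonymous object at some memory address.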