Monthly archive for January, 2014

Over the months and years I’ve been coding in Python, I’ve created quite a few Python packages, both open source and for private projects. Even though their most important part was always the code, numerous additional files are necessary for a package to correctly serve its purpose. Rather than being part of the Python language, they are more closely related to the Python platform.

But if you look for any definitive, systematic information about them, you will at best find scattered pieces of knowledge in various unrelated places. At worst, the only guidance comes in the form of a multitude of existing Python package sources, available on GitHub and similar sites. Parroting them is certainly an option, although I believe it’s much more advantageous to acquire a firm understanding of how those different cogs fit together. Without it, following modern Python’s best development practices – which are all hugely beneficial – is largely impossible.

So, I want to fill this void by outlining the structure of a Python package, as completely as possible. You can follow it as a step-by-step guide when creating your next project. Or just skim through it to see what you’re missing, and whether it’d be worthwhile to address such gaps. Any additional element or file will usually provide some tangible benefit, but of course not every project requires all the bells and whistles.

Without further ado, let’s see what’s necessary for a complete Python software bundle.

If you code in Python, then chances are that at some point, you have written a check similar to this one:

def function(arg):
    if not isinstance(arg, basestring):
        raise TypeError("%r is not a string" % arg)
    # ... rest of the function ...

Some would of course argue against putting such an explicit if in the code, insisting on relying on duck typing instead. But while this is an easy target for critique, it’s nowhere near the biggest problem you can find in the snippet above.

This code has a subtle bug. The bug is not even limited to checks like this one; it can occur in many different situations. It surfaces rarely, too, so it’s all the more surprising when it actually rears its ugly head.

The bug is related to string formatting, which in this case points to this expression:

"%r is not a string" % arg

Most of the time, it is perfectly fine and works flawlessly. But since arg is a value we do not have any control over, sometimes it may not work correctly. Sometimes, it can just blow the whole thing up, likely in a way we have not intended.

Wild tuple appears!

All it takes is for arg to be a tuple – any tuple. Tuples are special, because the string formatting operator (%) expects you’ll use them to pass more than one argument to fill in placeholders in the string:

def print_square(x):
    print "%d ^ 2 = %d" % (x, x*x)

The construct of a string followed by a percent sign, followed by parentheses, is very likely familiar to you. Notice, however, that there is nothing exceptional about using a tuple literal: what is important is the tuple type. Indeed, we could rewrite the above in the following manner:

def print_square(x):
    args = (x, x*x)
    print "%d ^ 2 = %d" % args

and the end result would be exactly the same. The only reason we prefer the first version is its obviously superior readability.

Tuple uses Misformat! It’s super effective!

Comparing that last piece of code with the first one, we can see quite clearly how everything will go horribly wrong should we try to format the TypeError’s message using an arg that happens to be a tuple. Not just one, but three different failure modes are possible here:

empty tuple (too few arguments for string formatting)

tuple with at least 2 elements (too many arguments)

tuple with exactly one element

The last one is particularly jarring. The formatter silently unpacks the tuple and formats its only element, so no exception is raised by the formatting itself, and the result can be a confusing message along the lines of:

'Alice has a cat' is not a string

Much head-scratching would probably ensue if you stumbled upon an exception that reports something like this.
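The three failure modes can be reproduced in a few lines. This is only a sketch: `describe` is a hypothetical helper introduced for illustration, and the snippet is written so that it also runs on Python 3, where the % operator treats tuples the same way.

```python
def describe(arg):
    """Format the message the naive way and report what happened."""
    try:
        return "%r is not a string" % arg
    except TypeError as exc:
        # The formatting itself failed before the intended message was built.
        return "TypeError: %s" % (exc,)


results = {
    "empty": describe(()),                     # too few arguments
    "pair": describe((1, 2)),                  # too many arguments
    "single": describe(("Alice has a cat",)),  # silently unpacked
    "non_tuple": describe(42),                 # the case that works as intended
}

for label in ("empty", "pair", "single", "non_tuple"):
    print("%s -> %s" % (label, results[label]))
```

The first two cases end in a TypeError raised by the formatter rather than by our check, while the one-element tuple yields a message about its element, a string, instead of about the tuple that was actually passed in.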

Tuple was caught!

To avoid these problems, one solution is to engage in some sort of pythonic homeopathy. As it turns out, we can cure the malady of tuples by adding even more tuples:

raise TypeError("%r is not a string" % (arg,))

Through this weird (arg,) singleton (1-tuple), we are explicitly sidestepping the error-prone feature of the % operator, where it allows a single right-hand side argument to be passed directly. Instead, we are always wrapping all the arguments in a tuple – yes, even if it means using the bizarre (1,) syntax. This way, we can fully control how many arguments we actually give to the formatter, regardless of what they are and where they came from.
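As a quick check of the 1-tuple approach, here is a minimal sketch. The names `check_string` and `error_message` are made up for this example, and `str` stands in for Python 2’s `basestring` so that the code runs on Python 3 as well.

```python
def check_string(arg):
    """Hypothetical helper mirroring the check from the start of the post."""
    if not isinstance(arg, str):  # basestring on Python 2
        # Wrapping arg in a 1-tuple means exactly one value is always
        # handed to the formatter, whatever arg happens to be.
        raise TypeError("%r is not a string" % (arg,))
    return arg


def error_message(arg):
    """Return the text of the TypeError raised for arg, or None if it passes."""
    try:
        check_string(arg)
    except TypeError as exc:
        return str(exc)
    return None


for value in ((), (1, 2), ("Alice has a cat",), "Alice has a cat"):
    print(error_message(value))
```

Every tuple now shows up in the message as a tuple, including the one-element case with its trailing comma.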

It’s not pretty, I know – it adds some visual clutter. But the main alternative, the format method, is even more verbose and riddled with issues of its own. C’est la vie.
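For comparison, this is roughly what the format method looks like for the same message. It is a sketch only: `format_message` is a hypothetical name, and the example covers just the tuple case, not whichever issues with format the post has in mind.

```python
def format_message(arg):
    # "{!r}" applies repr() to the argument, much like "%r" does,
    # and str.format never unpacks a tuple passed as a single argument.
    return "{!r} is not a string".format(arg)


for value in ((), (1, 2), ("Alice has a cat",)):
    print(format_message(value))
```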

There’s no better way to start a new year than a hearty, poignant rant. To set the bar high right off the bat, I’m not gonna be picking on some usual, easy target like JavaScript or PHP. To the contrary, I will lash out at everyone-and-their-mother’s favorite language: the one sitting comfortably in the middle between the established, mature, boring technologies of the enterprise and the cutting-edge, misbegotten phantasms of the GitHub generation.

That’s, of course, Python. A language so great that nearly all its flaws people talk about are shared evenly by others of its kin: the highly dynamic, interpreted languages. One would be hard-pressed to find anything that’s not simply a rehashed argument about maintainability, runtime safety or efficiency – concerns that apply equally well to Ruby, Perl and the like. What about anything specifically “pythonic”?…