The Zen of Python

What is the Zen of Python? To find out, enter

>>> import this

at the Python interpreter prompt. (This is an Easter egg.) You will see the following:

The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

A simple main function

Note: Python is often installed as /usr/local/bin/python, so it is also possible to start the script with

#!/usr/local/bin/python

/usr/bin/env python searches $PATH for python and runs the first one it finds.

Checking if a variable is a number (or number string)

Numbers (ints, floats, etc.) have a special __int__ method, so we can simply check if it exists:

def isNumber(x):
    return hasattr(x, '__int__')

This will return True for 5, -5 and 5.0, for instance. But what if we have a string representation? We are going to get False for each of "5", "-5" and "5.0". These are not numbers, they are strings.

What if we want to return True for strings representing numbers? We can use the following:

def isNumberOrNumberString(x):
    if isNumber(x): return True
    try:
        int(x)
    except ValueError:
        try:
            float(x)
        except ValueError:
            return False
    return True
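The two checks above can be tried out with the following sketch, written for Python 3. The snake_case names are hypothetical, and two liberties are taken: float() alone is used for the string test (for ordinary decimal strings it accepts everything int() does), and TypeError is caught as well so that inputs such as None return False instead of raising.

```python
def is_number(x):
    # Same idea as isNumber: numeric types define __int__, strings do not.
    return hasattr(x, "__int__")


def is_number_or_number_string(x):
    if is_number(x):
        return True
    try:
        float(x)
    except (TypeError, ValueError):
        # ValueError: a string that is not a number, e.g. "abc".
        # TypeError: something float() cannot handle at all, e.g. None.
        return False
    return True


results = [is_number_or_number_string(v) for v in (5, "5", "-5", "5.0", "abc", None)]
print(results)
```

Note that strings such as "nan" and "inf" are accepted by float() and therefore count as number strings here.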

repr() versus str()

The difference between repr() and str() in Python may not be immediately apparent.

According to the documentation, repr() returns the "official" string representation of an object. "If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned." In general, repr(o) should return a string representation of o such that the identity

o == eval(repr(o))

holds. eval() takes an (official) string representation of an object and returns a copy of that object constructed from this string representation.

On the other hand, str() returns an "informal" string representation of an object. "This differs from repr() in that it does not have to be a valid Python expression: a more convenient or concise representation may be used instead." In general, this representation should be human readable. There is no requirement for the identity

o == eval(str(o))

to hold.

Let us look at a few examples.

print str("Paul's test string")

prints

Paul's test string

while

print repr("Paul's test string")

prints

"Paul's test string"

The latter is a valid Python expression; the former is not.

print str(1.0 / 3.0)

prints

0.333333333333

while

print repr(1.0 / 3.0)

prints

0.33333333333333331

The latter attempts to give enough decimal figures to enable the value to be reconstructed to maximum precision.
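The round-trip identity discussed above can be checked directly. The sketch below is written for Python 3, where str() and repr() of a float give the same text, so only the string example fails the str() round trip. The helper name round_trips is hypothetical, and eval() should only ever be applied to trusted input.

```python
def round_trips(text, original):
    """Return True if evaluating text gives back a value equal to original."""
    try:
        return eval(text) == original
    except (SyntaxError, NameError):
        # The text is not a valid Python expression.
        return False


values = [3, 1.0 / 3.0, "Paul's test string", [3, "bar", 5.5], {"a": (1, 2)}]

repr_ok = [round_trips(repr(v), v) for v in values]
str_ok = [round_trips(str(v), v) for v in values]

print(repr_ok)
print(str_ok)
```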

It's a bit surprising that

print str([3, "paul's test string", 5.5, "bar", 7, 1.0 / 3.0])

and

print repr([3, "paul's test string", 5.5, "bar", 7, 1.0 / 3.0])

both print

[3, "paul's test string", 5.5, 'bar', 7, 0.33333333333333331]

on ActivePython 2.5.2.2. It looks like str() for lists is implemented by calling repr() iteratively on the elements. (Shouldn't it be calling str()?)
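The behaviour is easy to confirm with a small class whose __str__ and __repr__ differ. This is a Python 3 sketch and the class name Tag is made up for the illustration. If the str() form of each element is wanted, the elements have to be joined by hand.

```python
class Tag(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return "Tag(%r)" % self.name


items = [Tag("a"), 1.5]

as_str = str(items)  # the list formats its elements with repr()
joined = "[" + ", ".join(str(x) for x in items) + "]"  # explicit str() per element

print(as_str)
print(joined)
```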

Finally, for user-defined classes, repr() calls the __repr__() method, while str() calls the __str__() method. Here is an implementation of a simple class that provides both __repr__() and __str__() and conforms to the requirements imposed by the documentation:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if hasattr(other, "x") and hasattr(other, "y"):
            return (self.x == other.x) and (self.y == other.y)
        else:
            return False

    def __ne__(self, other):
        return not self.__eq__(other)

    def __str__(self):
        return "(%s, %s)" % (str(self.x), str(self.y))

    def __repr__(self):
        return "Point(%s, %s)" % (repr(self.x), repr(self.y))

Thus

pt = Point(3, 5)
print pt
print str(pt)
print repr(pt)
print eval(str(pt)) == pt
print eval(repr(pt)) == pt

prints

(3, 5)

(3, 5)

Point(3, 5)

False

True

The first two lines are identical because print calls __str__ when passed an object as its parameter. Notice that the result of repr(pt) can be used to reconstruct the Point object with eval().

Overridable properties in Python

class Foo(object):
    _a = 7

    def get_a(self):
        return self._a

    def set_a(self, a):
        self._a = a

    A = property(fget=get_a, fset=set_a)

class Bar(Foo):
    _newA = 5

    def get_a(self):
        return self._newA

    def set_a(self, a):
        self._newA = a

f = Foo()
print f.A

b = Bar()
print b.A

If Foo.get_a is overridden by Bar.get_a we would expect to see the output

7
5

But instead we see

7
7

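This is because in the line

A = property(fget=get_a, fset=set_a)

the binding happens early: when the body of Foo is executed, fget and fset are bound to the function objects Foo.get_a and Foo.set_a, for good. Defining get_a and set_a again in Bar does not change what the property A calls.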
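A quick, ad hoc way around the early binding is to give property() small wrappers that look the methods up on self at call time. This is only a sketch of that idea in Python 3; it has to be repeated for every property that should be overridable.

```python
class Foo(object):
    _a = 7

    def get_a(self):
        return self._a

    def set_a(self, a):
        self._a = a

    # The lambdas resolve get_a/set_a on the instance when A is accessed,
    # so a subclass override is picked up.
    A = property(fget=lambda self: self.get_a(),
                 fset=lambda self, a: self.set_a(a))


class Bar(Foo):
    _newA = 5

    def get_a(self):
        return self._newA

    def set_a(self, a):
        self._newA = a


f = Foo()
b = Bar()
print(f.A)
print(b.A)
```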

However, Python enables one to create overridable properties. The following implementation does the trick:

class OProperty(object):
    """Based on the emulation of PyProperty_Type() in Objects/descrobject.c"""

In other words, we have pre-filtered tradeSigns and ignored its elements equal to 1. Thus we skipped the 1's and obtained five elements in the resulting tradeDirections, rather than eight.

We could also do this:

tradeDirections = ["Sell" if ts == -1 else "Buy" for ts in tradeSigns]

In this case tradeDirections is set to

['Sell', 'Buy', 'Buy', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell']

perhaps in line with our original intentions. We didn't pre-filter tradeSigns and processed all its elements (thus we get eight elements in the result) but chose to replace the -1's with "Sell" and the 1's with "Buy".

In each case we used if but resorted to different syntax.
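The two forms can be put side by side. In this sketch the contents of tradeSigns are an assumption, reconstructed from the eight-element output shown above, and the filtering comprehension is one plausible version of the pre-filtered case rather than the original listing.

```python
# Assumed input, inferred from the output shown in the text.
tradeSigns = [-1, 1, 1, -1, -1, -1, 1, -1]

# Trailing "if": a filter. Elements that fail the test are dropped.
sellsOnly = ["Sell" for ts in tradeSigns if ts == -1]

# "x if cond else y" before "for": a conditional expression.
# Every element is kept and mapped to one of two values.
tradeDirections = ["Sell" if ts == -1 else "Buy" for ts in tradeSigns]

print(len(sellsOnly), len(tradeDirections))
```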

Filtering one list by another

Suppose you have defined

names = ["Paul", "Alex", "John", "Simon", "Paul", "Michael"]

surnames = ["Smith", "Jones", "Taylor", "Williams", "Brown", "Green"]

and now you want to print out the surnames of all Pauls. This can be achieved by using list comprehensions:

print [surnames[i] for i in range(len(names)) if names[i] == "Paul"]

which produces the output

['Smith', 'Brown']
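The same result can be obtained without index arithmetic by pairing the two lists with zip(). A short Python 3 sketch, with paul_surnames as a made-up variable name:

```python
names = ["Paul", "Alex", "John", "Simon", "Paul", "Michael"]
surnames = ["Smith", "Jones", "Taylor", "Williams", "Brown", "Green"]

# zip() yields (name, surname) pairs; the trailing "if" filters on the name.
paul_surnames = [surname for name, surname in zip(names, surnames) if name == "Paul"]

print(paul_surnames)
```

zip() stops at the shorter list, so lists of unequal length do not raise an IndexError here.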

Checking if an object is a sequence or is iterable

If o is your object, you can use the following check:

if hasattr(o, "__iter__"):
    # ...

The following code

print hasattr(5, "__iter__")
print hasattr([1, 2, 3, 4, 5], "__iter__")
print hasattr([5], "__iter__")
print hasattr((5), "__iter__")
print hasattr((5,), "__iter__")
print hasattr((3, 2), "__iter__")
print hasattr("asdf", "__iter__")

prints

False
True
True
False
True
True
False
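The results above are for Python 2. In Python 3 strings do have an __iter__ method, so the last check returns True there. The sketch below shows two checks that can be used in Python 3; both function names are hypothetical.

```python
from collections.abc import Iterable


def is_iterable(o):
    """True for anything iter() accepts, strings included."""
    try:
        iter(o)
    except TypeError:
        return False
    return True


def is_non_string_iterable(o):
    """True for lists, tuples, sets and so on, but not for str or bytes."""
    return isinstance(o, Iterable) and not isinstance(o, (str, bytes))


samples = [5, [1, 2, 3, 4, 5], (5), (5,), "asdf"]
print([is_iterable(o) for o in samples])
print([is_non_string_iterable(o) for o in samples])
```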

Implementing functors in Python

Any object with a __call__() method may be called using the function call syntax:

class Scale(object):
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, arg):
        return self.factor * arg

s = Scale(2)
print s(5)

The functor can have more than one argument:

import math

class Pythagoras(object):
    def __init__(self):
        pass

    def __call__(self, arg1, arg2):
        return math.sqrt(arg1 * arg1 + arg2 * arg2)

p = Pythagoras()
print p(3, 4)
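Because a functor is an ordinary callable, it can be passed wherever a function is expected. A brief Python 3 sketch reusing the Scale class from above (the variable names are made up):

```python
class Scale(object):
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, arg):
        return self.factor * arg


double = Scale(2)

doubled = list(map(double, [1, 2, 3]))          # functor as the map() function
descending = sorted([3, -1, 2], key=Scale(-1))  # functor as a sort key

print(doubled)
print(descending)
```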

Local variables in lambda expressions

We see that x + y is calculated twice in the following lambda expression:

func1 = lambda x, y, z: (x + y + z) / (x + y - z)

Can we compute it once and make it a local variable? One solution is to use a helper lambda expression:

func2 = lambda x, y, z: (lambda sum=x + y: (sum + z) / (sum - z))()

Now both

print func1(3.0, 5.0, 7.0)

and

print func2(3.0, 5.0, 7.0)

print the same number:

15.0
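Assuming Python 3.8 or later, an assignment expression offers another way to name the intermediate value inside a single lambda. This is a sketch of that alternative, not part of the original recipe; func3 and s are made-up names, and the parentheses around the assignment are required.

```python
func1 = lambda x, y, z: (x + y + z) / (x + y - z)

# (s := x + y) evaluates x + y once and binds it to s, local to the lambda.
# The numerator is evaluated first, so s is already set when the
# denominator uses it.
func3 = lambda x, y, z: ((s := x + y) + z) / (s - z)

print(func3(3.0, 5.0, 7.0))
```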

Instantiating a Python object dynamically by object class name

Use eval:

def forname(modname, classname):
    ''' Returns a class of "classname" from module "modname". '''
    module = __import__(modname)
    classobj = getattr(module, classname)
    return classobj

class Foo(object):
    def introduction(self):
        print "I am FOO"

class Bar(object):
    def introduction(self):
        print "I am BAR"

className = "Foo"
o = eval("%s()" % className)
o.introduction()

This will print

I am FOO

If, on the other hand, you set className to "Bar", you will see

I am BAR
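eval() will execute whatever string it is given, so when the class name comes from outside the program a lookup is safer. The Python 3 sketch below shows two options: a forname() variant built on importlib.import_module (which, unlike a bare __import__, returns the submodule itself for dotted names), and an explicit dictionary of allowed classes. The dictionary name classes is hypothetical, and introduction() returns its text here instead of printing it.

```python
import importlib


def forname(modname, classname):
    """Return the attribute classname from the module modname."""
    module = importlib.import_module(modname)
    return getattr(module, classname)


class Foo(object):
    def introduction(self):
        return "I am FOO"


class Bar(object):
    def introduction(self):
        return "I am BAR"


# Explicit whitelist of the classes that may be created by name.
classes = {"Foo": Foo, "Bar": Bar}

className = "Bar"
o = classes[className]()
print(o.introduction())

# Looking up a class in another module by name.
OrderedDict = forname("collections", "OrderedDict")
print(OrderedDict())
```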

Sending the output to STDERR rather than STDOUT

Instead of

print "Hello"

use

import sys
sys.stderr.write("Hello\n")
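In Python 3, where print is a function, the destination can be passed with the file argument. A minimal sketch; the helper name warn is made up:

```python
import sys


def warn(message):
    # sys.stderr is looked up at call time, so later redirections are honoured.
    print(message, file=sys.stderr)


warn("Hello")
```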

Making a path, rather than just making a directory

If we try the following

import os
os.mkdir("foo/bar/baz")

while foo/bar does not exist, foo/bar/baz will never be made. Depending on the operating system, we may see something like
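Whatever the exact message, os.mkdir() raises an OSError in this situation. The standard library function os.makedirs() creates the missing intermediate directories as well. The sketch below is for Python 3 (the exist_ok argument needs 3.2 or later) and works in a temporary directory; make_path is a hypothetical helper name.

```python
import os
import tempfile


def make_path(path):
    """Create path together with any missing parent directories."""
    os.makedirs(path, exist_ok=True)


base = tempfile.mkdtemp()  # scratch directory for the demonstration
target = os.path.join(base, "foo", "bar", "baz")

try:
    os.mkdir(target)  # foo/bar does not exist yet, so this fails
    mkdir_worked = True
except OSError:
    mkdir_worked = False

make_path(target)
print(mkdir_worked, os.path.isdir(target))
```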

Reading a text file backwards

Reading a text file backwards is a relatively common task. Let me explain first what I mean by backwards: you read the file line by line, starting from the last line and progressing towards the first.

Why would you need this? Imagine that you have a large CSV (comma-separated values) file with numerous records sorted in ascending order by date/time. You want to read the last N records. Using the standard text file input/output machinery, you would probably end up reading the entire file and discarding all but the last N records, which is extremely wasteful. Chances are you will have more than one such file.
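One way to avoid reading the whole file is to seek to the end and work backwards in fixed-size blocks. The following is a minimal Python 3 sketch of that idea, not a definitive implementation: the function name, the block_size default, the UTF-8 encoding, and "\n" line endings are all assumptions.

```python
import os


def read_lines_backwards(path, block_size=4096, encoding="utf-8"):
    """Yield the lines of a file from last to first, without newlines."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        if position == 0:
            return  # empty file: no lines at all
        tail = b""      # start of a line whose beginning lies in an earlier block
        at_end = True
        while position > 0:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            block = f.read(read_size)
            if at_end and block.endswith(b"\n"):
                block = block[:-1]  # drop the newline terminating the last line
            at_end = False
            pieces = (block + tail).split(b"\n")
            # The first piece may be incomplete, so hold it back.
            tail = pieces.pop(0)
            for piece in reversed(pieces):
                yield piece.decode(encoding)
        yield tail.decode(encoding)
```

The last N records can then be taken with itertools.islice(read_lines_backwards(path), N), which stops after N lines and so reads only the blocks it needs.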

Offsetting a multiline string with spaces

Suppose that you need to offset a multiline string with a given number of spaces. You can use the regular expression "^" to match the beginning of the line (to match the beginning of each line in a multiline string, you must use the re.MULTILINE modifier):

import re

lineStartRegEx = re.compile("^", re.MULTILINE)

def offsetWithSpaces(s, spaceOffset):
    return lineStartRegEx.sub(" " * spaceOffset, s)

Then

s = "foo\n bar\nbaz"

print offsetWithSpaces(s, 4)

produces

    foo
     bar
    baz
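In Python 3.3 and later the standard library also provides textwrap.indent(), which gives the same result for this input. The sketch below compares the two; the variable names with_regex and with_textwrap are made up.

```python
import re
import textwrap

lineStartRegEx = re.compile("^", re.MULTILINE)


def offsetWithSpaces(s, spaceOffset):
    return lineStartRegEx.sub(" " * spaceOffset, s)


s = "foo\n bar\nbaz"

with_regex = offsetWithSpaces(s, 4)
with_textwrap = textwrap.indent(s, " " * 4)

print(with_textwrap)
```

One difference to keep in mind: by default textwrap.indent() leaves lines that consist only of whitespace unindented.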

Chomping in Python

Perl users will be familiar with the chomp function which removes the trailing newline character if it's present. They will use it like so:

What is IronPython?

IronPython is an implementation of the Python programming language running under .NET and Silverlight. It supports an interactive console with fully dynamic compilation. It's well integrated with the rest of the .NET Framework and makes all .NET libraries easily available to Python programmers, while maintaining compatibility with the Python language.

IronPython is an open source project freely available under the Microsoft Public License.

However, this is not recommended: "It can also be used to restore the actual files to known working file objects in case they have been overwritten with a broken object. However, the preferred way to do this is to explicitly save the previous stream before replacing it, and restore the saved object" (14th October, 2009).

Thus our original approach is preferred. However, there is a caveat: "Changing [sys.stdout] doesn’t affect the standard I/O streams of processes executed by os.popen(), os.system() or the exec*() family of functions in the os module."

Thus

import sys

originalStdout = sys.stdout
sys.stdout = open("mystdout.txt", "a")
someFunction()
sys.stdout = originalStdout

If inside someFunction() we execute an external process using the aforementioned functions or even use some code from Windows DLLs, that external code will be using the original STDOUT. Our redirect won't work for it.
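For the pure-Python part of the redirect, it helps to restore sys.stdout and close the file even when someFunction() raises. The Python 3 sketch below shows this with try/finally and, alternatively, with contextlib.redirect_stdout (available from Python 3.4). someFunction here is a stand-in, and the two run_* names are hypothetical. The limitation described above still applies: external processes keep writing to the original STDOUT.

```python
import contextlib
import sys


def someFunction():
    # Stand-in for the function from the text.
    print("Hello from someFunction")


def run_redirected(path):
    originalStdout = sys.stdout
    log = open(path, "a")
    sys.stdout = log
    try:
        someFunction()
    finally:
        # Runs whether or not someFunction() raised.
        sys.stdout = originalStdout
        log.close()


def run_redirected_ctx(path):
    # The same effect using context managers.
    with open(path, "a") as log, contextlib.redirect_stdout(log):
        someFunction()
```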

On Windows, we could use the win32api functions to resolve this problem:

mystdout.log will be closed when o goes out of scope, before you get a chance to call someFunction(). So you need to make sure that o doesn't go out of scope before the time is ripe.

However, when we redirect the "system" STDOUT and STDERR as described here, the "Python" STDOUT and STDERR won't change. Thus we need to redirect both. If we redirect them to the same file, we need to make sure that we set win32file.FILE_SHARE_READ | win32file.FILE_SHARE_WRITE (or else the file will be locked). Moreover, we can append to the end of file using win32file.SetFilePointer. Putting this all together, we obtain the following:

Thus, when you import mymodule.py, the function fooBar will be bound to either fooBar_zlib or fooBar_noZlib depending on whether the module zlib is present on your system or not. As a user of mymodule.py you don't have to know exactly which implementation is being used. The logic which determines which of the two implementations to use is encapsulated in mymodule.py.
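The arrangement described in this paragraph can be sketched as follows, in Python 3. Only the names fooBar, fooBar_zlib and fooBar_noZlib come from the text; what the functions actually do (compressing a byte string, or passing it through unchanged) is an assumption made for the illustration.

```python
# Hypothetical contents of mymodule.py
try:
    import zlib
except ImportError:
    zlib = None  # zlib is not available on this system


def fooBar_zlib(data):
    # Assumed behaviour: compress the data with zlib.
    return zlib.compress(data)


def fooBar_noZlib(data):
    # Assumed fallback: return the data unchanged.
    return data


# Bind the public name once, at import time.
fooBar = fooBar_zlib if zlib is not None else fooBar_noZlib
```

Callers simply use mymodule.fooBar(...) and never need to test for zlib themselves.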

Define functions conditionally on the operating system

A similar idea can be used to define functions conditionally on the operating system.

If we are running on Linux or Unix, os.name should be "posix". If we are running on Windows, os.name should be "nt". So we could do the following
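A minimal sketch of this pattern, in Python 3. The function clearScreenCommand and what it returns are hypothetical, chosen only to show the mechanism.

```python
import os

if os.name == "posix":
    def clearScreenCommand():
        # Linux, Unix, macOS
        return "clear"
elif os.name == "nt":
    def clearScreenCommand():
        # Windows
        return "cls"
else:
    def clearScreenCommand():
        raise NotImplementedError("unsupported platform: %s" % os.name)
```

As with the zlib example, the choice is made once when the module is imported, and callers use clearScreenCommand() without checking the platform.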