On language, setup, environment

(Major) Python implementations

CPython is the C implementation of Python, is the usual implementation used as a system Python, and is also the reference implementation of Python.

Jython implements Python in Java. Apparently it is only slightly slower than CPython, and it brings in the Java standard library to be used from Python code, though you lose C extensions(verify).

IronPython compiles Python to IL, to run it on the .NET VM.
It performs similarly to CPython (some things are slower, a few things even faster), but like any other .NET language, you get .NET interaction.

You lose the direct use of C extensions (unless you have fun with C++/CLI), though .NET itself often has some other library to the same effect.

Python for .NET is different from IronPython in that it does not produce IL or run on the .NET VM, but is actually a managed C interface to CPython(verify) (which also seems to work on Mono).

While somewhat hairier than IronPython, it means you can continue to use C extensions, as well as interact with .NET libraries; the .NET library can be directly imported, and you can load assemblies.

There is also PyPy [1][2], which is an implementation of Python in Python. It seems this was originally for language hacking and such (since mucking with the language is easier to do in Python than in C), but it has since become a good JIT compiler (relying for a good part on RPython, a subset of Python that can be statically compiled) that can give speed improvements similar to the now-aging psyco.

Help / documentation

An unassigned string appearing before any code(verify) at module, class, or function level is interpreted as a docstring (stored in the __doc__ attribute).

Docstrings will show up in documentation that can be automatically generated based on just about anything. For example:

compile(pattern, flags=0)
    Compile a regular expression pattern, returning a pattern object.
escape(pattern)
    Escape all non-alphanumeric characters in pattern.
findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.
    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.
    Empty matches are included in the result.
finditer(pattern, string, flags=0)
    Return an iterator over all non-overlapping matches in the
    string. For each match, the iterator returns a match object.
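As a minimal sketch (greet is a made-up name), the docstring ends up in the __doc__ attribute, which is what help() and documentation generators read:

```python
def greet(name):
    """Return a friendly greeting for name."""  # first statement: the docstring
    return 'hello, ' + name

# the same string is accessible programmatically:
assert greet.__doc__ == 'Return a friendly greeting for name.'
```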

callable

You can call a function, a method, and a class (or, technically, a type).

More specifically, any instance whose class has a __call__ method is callable.

In many situations where you could pass a function, you can pass any callable, because most of the time all the backing code does is call the object.

To test whether something can be called, you could use callable() (a built-in).

If you wish to test for more specific cases (callable class? function? method?), you can use the inspect module (see its help() for more details than some html documentation out there seems to give).
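A quick sketch of both points (Adder is a made-up class): callable() recognizes functions, classes, and instances with __call__, and such an instance can stand in for a function:

```python
class Adder(object):
    "Instances are callable because the class defines __call__."
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        return self.n + x

add5 = Adder(5)
assert callable(add5)    # instance with __call__
assert callable(len)     # built-in function
assert callable(Adder)   # classes are callable too (calling one constructs)
assert not callable(42)
assert add5(3) == 8      # usable anywhere a one-argument function is expected
```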

singularity on top of immutability

Identity is compared with the is operator; the built-in id() function returns the identity value that this comparison is based on.

Some things in python are singular (on top of being immutable), by design.
You could say this messes with the identity abstraction, but is primarily used to make life simpler, and generally does.

For example, you can test against types and None as if they are values, meaning you can use either is or == without one of them meaning something subtly but fatally different.
In practice this seems better than having to know all the peculiarities of the typing system (if only because we tend to have to know several languages' worth).

Numbers are immutable, and in CPython small integers are also effectively singular (interned).

Strings are immutable but not singular -- although there are cases where they seem to act that way, for example interned string literals (are there further details?(verify)).
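A small sketch of the difference; the identity comment describes CPython behaviour and should not be relied on:

```python
a = 'example'
b = 'example'
assert a == b    # equality always holds
# 'a is b' is often True in CPython because simple literals get interned,
# but that is an implementation detail -- never rely on string identity.

x = None
assert x is None and x == None  # for singular values like None, is and == agree
```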

Calling superclass methods, super()

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Firstly, the standard remark: if you're making inheritance diamonds, this is complexity any way you twist it, so is it necessary rather than happy-go-lucky class modelling?

If you use super(), it should be used consistently, and only when you know the potential problems and can explain to other people why it won't fail. (Read the two things linked to, or something like them.)

Many argue that it's more understandable and less error-prone to handle superclass calls explicitly.
Since superclassing is effectively part of a class's external interface anyway (and so is super, if you use it), you might as well be explicit, rather than have it be hidden by implied semantics.

While more verbose than super(), the explicit style is easier to follow, less magical, and less fragile under later changes, and mistakes are probably easier to spot since they don't come from implicit behaviour. (You can argue about the fragility -- yes, it will cause errors quickly when you change arguments, but that's arguably preferable to the alternative.)

One assumption here is that class inheritance is used for eliminating redundancy in your own codebase, not for flexibility.

However, when writing things like mixins (or abstract classes or interfaces), you may still need to know all about super()
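A minimal sketch of the two styles in single inheritance (class names made up); super() is written with explicit arguments, as py2 requires:

```python
class Base(object):
    def __init__(self):
        self.log = ['Base']

class Explicit(Base):
    def __init__(self):
        Base.__init__(self)          # explicit superclass call: easy to follow
        self.log.append('Explicit')

class Cooperative(Base):
    def __init__(self):
        super(Cooperative, self).__init__()  # cooperative: follows the MRO
        self.log.append('Cooperative')

assert Explicit().log == ['Base', 'Explicit']
assert Cooperative().log == ['Base', 'Cooperative']
```

In this simple case they behave identically; the differences only start to matter with multiple inheritance.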

'call at python exit'

Avoid assigning a callable to sys.exitfunc yourself: only one function can be set, so you may break cleanup for other things when you use it. (You could make it a function that also calls whatever the function was previously set to, but there may be hairy details to that(verify).)

Note that there is never a hard guarantee that the code will get run, considering things like segfaults, which Python itself should be pretty safe from, but isn't too hard to find in some C extensions.
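The atexit module sidesteps the single-slot problem, since any number of handlers can be registered (the handler body here is made up; as noted above, a hard crash can still skip them):

```python
import atexit

messages = []

def cleanup():
    messages.append('cleaned up')

# handlers run in reverse registration order at normal interpreter exit
handler = atexit.register(cleanup)
assert handler is cleanup  # register() returns the function, so it can also be used as a decorator
```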

Builtins

Built-ins are things accessible without specific imports. The following are the 2.4 built-ins, a mix of types and functions, roughly grouped by purpose.

dir, help

str, unicode (and their virtual superclass, basestring)

oct, hex, ord, chr, unichr

int, long, float, complex

abs, round, divmod, min, max, pow

tuple, list

len, sum

filter, reduce, map, apply

zip

iter, enumerate

reversed, sorted

cmp

range, xrange

dict, intern

set, frozenset

bool

coerce

slice (used only for extended slicing - e.g. [10:0:-2])

buffer (a hackish type convenient to CPython extensions and some IO)

object

hash, id

repr should give a description useful in debugging -- a short, unambiguous description of the object that lets you distinguish it from others, and possibly gives more information. Supported by __repr__; if you write your own, signal that/when you may not be showing everything.
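A sketch of a debugging-oriented __repr__ (Point is a made-up class):

```python
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __repr__(self):
        # short and unambiguous, enough to tell instances apart
        return 'Point(x=%r, y=%r)' % (self.x, self.y)

assert repr(Point(1, 2)) == 'Point(x=1, y=2)'
```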

Shallow and deep copy

General shallow/deep copies are possible (on top of the basic reference assignment).

The following demonstration uses lists as a container, but this also applies to objects.
(A plain reference assignment does not duplicate real objects or mutable structures like lists, since they themselves contain references; the concept of copying such objects is therefore ambiguous, which is why there is a distinction between shallow and deep copying.)
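A minimal version of that demonstration, using the copy module, with a the original, b a shallow copy, and c a deep copy:

```python
import copy

a = [[1, 2], [3, 4]]   # a list containing mutable objects
b = copy.copy(a)       # shallow copy: new outer list, same inner lists
c = copy.deepcopy(a)   # deep copy: new outer list, new inner lists

a[0].append(99)
assert b[0] == [1, 2, 99]  # b shares the inner lists with a, so it sees the change
assert c[0] == [1, 2]      # c made its own copies, so it does not
```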

The shallow copy, b, is a new object (in this case a new list object) into which references to the objects in the old collection are inserted.

The deep copy, c, creates not only a container object but also creates copies of the old contained objects to insert into the new container.

With objects, or structures that contain objects, you regularly want the latter.

Often enough the deep copy is what you mean when making a duplicate -- but note that this only works when the creation of these objects has no peculiar side effects and does not rely on administrative data or object references that would not behave the same way in a deep copy.

Such issues limit deep copying in any language. There are usually partial fixes, often in the form of some way to optionally override deep-copy behaviour with your own functionality via an interface.
Note that python's deepcopy does avoid circular recursion problems.

Note that for lists, a subrange selection like a[1:4] is effectively a partial copy, and a[:] a full one. Creating a new list from an old one, list(a), behaves the same way. Both are effectively shallow copies.

(Syntax) Changes

New python features

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Functional-like things

Note on closures

See Closures for the basic concept. They were apparently introduced in py2.2.

Note that python closures are read-only, which means that assigning to a closed-over variable actually creates a local variable with the same name. This works at function level, so even:

def f():
    x = 3
    def g():
        y = x  # ...with the idea that this would create a local y
        x = y  # ...and this a local x
        print x
    g()
f()

...won't work; function locals are determined at function compilation time, so the assignment means "x is local to g's scope", and the read of x then uses it before anything is assigned to it.
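A common workaround (predating py3's nonlocal) is to close over a mutable container instead of rebinding the name; counter is a made-up example:

```python
def counter():
    count = [0]  # wrapped in a list so the closure can mutate it
    def increment():
        count[0] += 1  # mutates the list, doesn't rebind the name,
        return count[0]  # so no local variable shadows it
    return increment

inc = counter()
assert inc() == 1
assert inc() == 2
```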

Lambda, map, filter, reduce

Lambda expressions are functions of the form lambda args: expression, for example lambda x: 2*x. They must be single expressions, so they cannot contain newlines, and therefore no complex code unless they call functions.

They can be useful in combination with e.g. map(), filter() and such. For example:

Map gives a new list that comes from applying a function to every element of an input list/iterable. For example:

>>> map(lambda x:4*x, ['a',3])
['aaaa', 12]

Filter creates a new list containing only the elements for which the function returns true.
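A sketch, wrapped in list() so it reads the same in py2 (where filter returns a list) and py3 (where it returns an iterator):

```python
evens = list(filter(lambda x: x % 2 == 0, [1, 2, 3, 4, 5]))
assert evens == [2, 4]  # only the elements for which the lambda returned true
```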

This functional style has various uses, among which laziness, and even brevity/readability.
Consider:

map(lambda x: 2*x, [1,2,3,4,'q'])
# arguably the following is more readable (though not much)
(2*x for x in [1,2,3,4,'q'])

...though note that the former returns a list, the latter a generator. Both are iterable and fine in a context of iteration, but the latter e.g. can't be index-addressed.

In a few cases, the lists returned by list comprehensions may serve you better than the generators returned by generator expressions. For example, lists allow random access. On the other hand, for purely iteration, generator expressions of nontrivial size are more efficient and neat. Consider, for example:

max([x for x in xrange(10000000)])      # memory use spikes to ~150MB
max(list(x for x in xrange(10000000)))  # (...because it is a shorthand for this)
max(x for x in xrange(10000000))        # uses no perceptible memory (is a generator)

Note the use of xrange, a generator-like version of range(). There would be two problems if you used range(): both lines would then generate a huge list before being able to iterate over it, and you'd win nothing.

Note: Don't use the profiler to evaluate xrange vs. range; it adds overhead to each function call, of which there will be ten million with the generator and only a few when using [] / list(). This function call is cheap in regular use, but not in a profiler.

This brings up the point that you can do this:

legalMoves = (possMove for possMove in checkers.allMoves() if legal(possMove))

Assuming allMoves is also a generator function, this will cascade the next() calls backwards, and is therefore lazily executed.

Consider constructions like one to indefinitely generate random numbers:

randoGen = (random.random() for i in itertools.repeat(0))

Of course, randoGen.next() more or less does what random.random() is defined to do, and there is no need to keep extra state, rendering this particular example mostly pointless. With global references (perhaps closured generators?) this can probably be useful.

About the syntax: it seems that any place there are already regular brackets (()) you don't have to add brackets for a generator expression.

Iterables

In general, iterators allow walking through iterables with minimal state, usually implemented by an index into a list or a hashmap's keys, or possibly a pointer if implemented in a lower-level language.

Iterations won't take it kindly when you change the data you are iterating over, so something like:

a = {1:1, 2:2, 3:3}
for e in a:
    a.pop(e)

...won't work. The usual solution is to build a list of things to delete, and to do the deleting after we're done iterating.
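For example, deleting entries from a dict by collecting the keys first and mutating afterwards:

```python
a = {1: 1, 2: 2, 3: 3}
to_delete = [k for k in a if k > 1]  # collect while iterating...
for k in to_delete:                  # ...delete after iteration is done
    a.pop(k)
assert a == {1: 1}
```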

In python, most collection types are iterable on request. Iterating a dict means iterating its keys. This is based on __iter__.

Syntax things

lists

More things to do with lists...

range, xrange

>>> range(4)
[0, 1, 2, 3]
>>> range(2,4)
[2, 3]
>>> range(4,2,-1)
[4, 3]

xrange is a generator-style equivalent (a special case that for most intents and purposes acts like a generator, but isn't exactly one) that yields the same values. It is therefore safe in terms of memory to do xrange(400000000) (and not so clever to do range(400000000)).

Some more notes

This can be useful to make the actual call to a big initializer function forwards and backwards compatible: the call will not fail when you use new or old keywords.
You can choose to ignore, warn, or throw an error as is practical.

Without this, the call itself may fail when you thought you were using a different version -- and it's annoying when different versions of the same thing are not drop-in replacements.

One note related to defaults (not python-specific):

It sometimes makes sense to make a function react with a default when passed a value like None, rather than relying on the default value in the function definition. Pieces of pass-through code that may or may not receive a user/config value cannot easily rely on a function-definition default; the alternatives are building a **kwargs dict (with the value explicitly removed) for the call, or if-thenning between slightly different calls of the same function (ew).
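A sketch of the pattern; connect and timeout are made-up names:

```python
def connect(timeout=None):
    # resolve the real default inside the function, so pass-through code
    # can always hand in None ("use the default") without knowing its value
    if timeout is None:
        timeout = 30
    return timeout

assert connect() == 30
assert connect(timeout=5) == 5
assert connect(timeout=None) == 30  # same as not passing it at all
```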

Member-absence robustness

More than once I've wanted to check whether an object has a particular member -- without accessing it directly, since that would throw an AttributeError if it wasn't there.

One way is to use the built-in function hasattr(object, name) (which is just a getattr that catches exceptions).
Another is to do 'membername' in dir(obj).

If you want to get the value and fall back on a default if the member isn't there, you can use the built-in getattr(object, name[, default]), which allows you to specify a default to return instead of throwing an AttributeError.
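All three checks in one sketch (Thing is a made-up class):

```python
class Thing(object):
    name = 'widget'

t = Thing()
assert hasattr(t, 'name')
assert not hasattr(t, 'color')
assert 'name' in dir(t)                      # the dir()-based check
assert getattr(t, 'color', 'red') == 'red'   # default instead of AttributeError
```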

OO

Accessors (property)

What are known as accessors in other languages can be done in python too, by overriding attribute access; the property() mechanism is largely syntactic sugar for creating the backing functions.

The property() built-in serves approximately the same purpose as, say, C#'s property syntax.

The signature is property(fget, fset, fdel, doc), all of which are None by default and assignable by keyword.

You should use new-style classes (the class has to inherit from object); without that inheritance you get a classic class, in which the above would create plain members instead(verify).

these won't show up in dir()s - they're not object members (which allows you to create shadow members and do other funky things)

Note that this indirection makes this slower than real attributes
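A sketch of property() on a new-style class; Celsius and its member names are made up:

```python
class Celsius(object):  # must inherit object for property() to work
    def __init__(self, temp=0.0):
        self._temp = temp

    def _get_temp(self):
        return self._temp

    def _set_temp(self, value):
        if value < -273.15:
            raise ValueError('below absolute zero')
        self._temp = value

    temperature = property(_get_temp, _set_temp, doc='temperature in Celsius')

c = Celsius()
c.temperature = 25.0            # goes through _set_temp
assert c.temperature == 25.0    # goes through _get_temp
```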

self

Python doesn't hide the fact that class functions pass/expect a reference to the object they are working on. This takes getting used to if you've never used classes this way before.

Class functions must be declared with 'self', and must be called with an instance. This makes it clear to both python and coders whether you're calling a class function or not.
Consider:

def f():
    print 'non-class'

class c(object):
    def __init__(self):
        f()       # refers to the function above. Prints 'non-class'
        self.f()  # refers to the method below. Prints 'class', and the object
        # note that self-calls imply adding self as the first parameter
    def f(self):
        print 'class; ' + str(self)
    def g():      # uncallable - see note below
        print 'non-class function in class'

o = c()
c.f(o)  # c.f() alone would fail; no object to work on

You can mess with this, but it's rarely worth the trouble.

Technically, you can add non-class functions to a class, for example g() above. However, you can't call it: self.g() and o.g() fail because python adds self as the first argument, and g() takes no arguments.
It does this based on metadata on the function itself -- it remembers that it's a class method, so even h=o.g; h() won't work.

You can stick non-class functions onto the object after the fact, but if you do this in a class-naive way, python considers these as non-class functions and will not add self automatically.
You can call these, but you have to remember they're called differently.
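A sketch of that class-naive attachment (names made up): the function sits on the instance, so python does not add self, and the object must be passed explicitly if it is wanted:

```python
class C(object):
    pass

def standalone(obj):
    return 'called with ' + repr(obj)

o = C()
o.f = standalone  # stuck onto the instance, not the class...
# ...so this is a plain function call; no self is added
assert o.f(o).startswith('called with')
```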

Details to objects, classes,and types

New-style vs. classic/old-style classes

New-style classes were introduced in py2.2, partly to make typing more intuitive by making classes types.

Old-style classes stayed the default (the interpretation for classes defined like class Name: rather than class Name(object):) up to(verify) and excluding python 3. Py3k removed old-style classes completely and made what was previously known as new-style behaviour the only behaviour.

When the distinction exists, new-style classes are subclasses of the object class -- indirectly by subclassing another new-style class or built-in type, or directly by subclassing object:

class Name(object):
    pass  # things

Differences:

old-style objects will report their type() as instance, while new-style objects will report their class name (__class__)

__divmod__: returns a tuple (the result of __floordiv__ and that of __mod__)

__pow__: for **

__lshift__: for <<

__rshift__: for >>

__xor__: for ^

__invert__: for ~

Special-purpose:

__all__: a list of names that should be considered public(verify) (useful e.g. to avoid exposing indirect imports)

__nonzero__: used in truth value testing

__i*__ variants

All the in-place variations (things like +=) are represented too, by __isomething__. For example: __iadd__ (+=), __ipow__ (**=), __ilshift__ (<<=), __ixor__ (^=), __ior__ (|=), and so on.

__r*__ variants

The __rsomething__ variants are variations with swapped operands. Consider x - y. This would normally be evaluated as

x.__sub__(y)

If that operation isn't supported, python looks at whether y has a __rsub__, and whether it can instead evaluate it as:

y.__rsub__(x)

Implementing both in the obvious way makes their evaluation equivalent. This allows you to define your own types which can be used on either side of each binary operator, non-redundantly and with a self-contained definition.

(Only for binary operator use: it doesn't apply to unary - or +, or to the ternary pow())
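For example, a sketch of a type usable on either side of - (Meters is a made-up class):

```python
class Meters(object):
    def __init__(self, n):
        self.n = n
    def __sub__(self, other):        # handles  Meters - something
        return Meters(self.n - float(other))
    def __rsub__(self, other):       # handles  something - Meters
        return Meters(float(other) - self.n)
    def __float__(self):
        return self.n

assert float(Meters(5) - 2) == 3.0   # uses __sub__
assert float(10 - Meters(4)) == 6.0  # int.__sub__ fails, so __rsub__ is used
```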

__getattr__ and __getattribute__

For old-style classes, if normal member access doesn't find anything, the __getattr__ is called instead.

In new-style classes, __getattribute__(self, name) is used for all attribute access. __getattr__ will only be called if __getattribute__ raises an AttributeError (and, obviously, if __getattr__ is defined).

Since __getattribute__ is used unconditionally, it is possible to create infinite loops when you access members on self in the self.name style. This is avoided by explicitly using the base class's __getattribute__ for that.
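A sketch of both hooks on one (made-up) class, delegating to object.__getattribute__ to avoid the recursion:

```python
class Guarded(object):
    def __init__(self):
        self.data = {'x': 1}

    def __getattr__(self, name):
        # only reached when normal lookup (via __getattribute__) fails
        return 'missing:' + name

    def __getattribute__(self, name):
        # delegate to the base class lookup instead of self.<name>,
        # which would recurse forever
        return object.__getattribute__(self, name)

g = Guarded()
assert g.data == {'x': 1}          # found normally
assert g.nope == 'missing:nope'    # AttributeError fell through to __getattr__
```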