Defines descriptors, summarizes the protocol, and shows how descriptors are
called. Examines a custom descriptor and several built-in Python descriptors
including functions, properties, static methods, and class methods. Shows how
each works by giving a pure Python equivalent and a sample application.

Learning about descriptors not only provides access to a larger toolset, it
creates a deeper understanding of how Python works and an appreciation for the
elegance of its design.

In general, a descriptor is an object attribute with “binding behavior”, one
whose attribute access has been overridden by methods in the descriptor
protocol. Those methods are __get__(), __set__(), and
__delete__(). If any of those methods are defined for an object, it is
said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the
attribute from an object’s dictionary. For instance, a.x has a lookup chain
starting with a.__dict__['x'], then type(a).__dict__['x'], and
continuing through the base classes of type(a) excluding metaclasses. If the
looked-up value is an object defining one of the descriptor methods, then Python
may override the default behavior and invoke the descriptor method instead.
Where this occurs in the precedence chain depends on which descriptor methods
were defined. Note that descriptors are only invoked for new style objects or
classes (a class is new style if it inherits from object or
type).

Descriptors are a powerful, general purpose protocol. They are the mechanism
behind properties, methods, static methods, class methods, and super().
They are used throughout Python itself to implement the new style classes
introduced in version 2.2. Descriptors simplify the underlying C-code and offer
a flexible set of new tools for everyday Python programs.
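The protocol comes down to three method signatures. As a minimal sketch, the hypothetical `Descriptor` class below implements all three; the class names and the `_value` storage key are illustrative assumptions, and the code assumes Python 3, where every class is new style:

```python
class Descriptor:
    """Hypothetical class implementing all three protocol methods."""

    def __get__(self, obj, objtype=None):
        # Called on attribute lookup; obj is None when accessed on the class.
        if obj is None:
            return self
        return obj.__dict__.get('_value')

    def __set__(self, obj, value):
        # Called on attribute assignment; returns None.
        obj.__dict__['_value'] = value

    def __delete__(self, obj):
        # Called on attribute deletion; returns None.
        obj.__dict__.pop('_value', None)


class Owner:
    attr = Descriptor()


o = Owner()
o.attr = 10        # routed to Descriptor.__set__
value = o.attr     # routed to Descriptor.__get__
del o.attr         # routed to Descriptor.__delete__
```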

That is all there is to it. Define any of these methods and an object is
considered a descriptor and can override default behavior upon being looked up
as an attribute.

If an object defines both __get__() and __set__(), it is considered
a data descriptor. Descriptors that only define __get__() are called
non-data descriptors (they are typically used for methods but other uses are
possible).

Data and non-data descriptors differ in how overrides are calculated with
respect to entries in an instance’s dictionary. If an instance’s dictionary
has an entry with the same name as a data descriptor, the data descriptor
takes precedence. If an instance’s dictionary has an entry with the same
name as a non-data descriptor, the dictionary entry takes precedence.
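The difference can be observed by writing directly into an instance dictionary. This is a small sketch assuming Python 3; `DataDesc`, `NonDataDesc`, and `C` are hypothetical names chosen for illustration:

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class C:
    d = DataDesc()
    n = NonDataDesc()


c = C()
# Write straight into the instance dictionary, bypassing __set__.
c.__dict__['d'] = 'from instance dict'
c.__dict__['n'] = 'from instance dict'

print(c.d)  # from data descriptor
print(c.n)  # from instance dict
```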

A descriptor can be called directly by its method name. For example,
d.__get__(obj).

Alternatively, it is more common for a descriptor to be invoked automatically
upon attribute access. For example, obj.d looks up d in the dictionary
of obj. If d defines the method __get__(), then d.__get__(obj)
is invoked according to the precedence rules listed below.
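Both styles of invocation can be compared side by side. In this sketch (Python 3 assumed), `Ten` and `A` are hypothetical names, and the raw descriptor is fetched from the class dictionary so that it is not invoked by the lookup itself:

```python
class Ten:
    """Hypothetical non-data descriptor that always returns 10."""

    def __get__(self, obj, objtype=None):
        return 10


class A:
    d = Ten()


a = A()
d = A.__dict__['d']      # fetch the raw descriptor without invoking it
direct = d.__get__(a)    # direct call by method name
automatic = a.d          # automatic call during attribute access
```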

The details of invocation depend on whether obj is an object or a class.
Either way, descriptors only work for new style objects and classes. A class is
new style if it is a subclass of object.

For objects, the machinery is in object.__getattribute__() which
transforms b.x into type(b).__dict__['x'].__get__(b,type(b)). The
implementation works through a precedence chain that gives data descriptors
priority over instance variables, instance variables priority over non-data
descriptors, and assigns lowest priority to __getattr__() if provided. The
full C implementation can be found in PyObject_GenericGetAttr() in
Objects/object.c.
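The precedence chain described above can be approximated in pure Python. The sketch below is a simplification, not the actual C code: `generic_getattr`, `lookup_in_mro`, and `Demo` are hypothetical names, Python 3 is assumed, and the `__getattr__()` fallback is folded into the same function for illustration:

```python
_MISSING = object()


def lookup_in_mro(cls, name):
    """Find name in the class dictionaries along the MRO, without binding."""
    for base in cls.__mro__:
        if name in base.__dict__:
            return base.__dict__[name]
    return _MISSING


def generic_getattr(obj, name):
    """Approximate the precedence chain for instance attribute lookup."""
    cls = type(obj)
    attr = lookup_in_mro(cls, name)
    getter = None
    if attr is not _MISSING:
        getter = getattr(type(attr), '__get__', None)
    # The C check treats __set__ or __delete__ as the data descriptor marker.
    is_data = getter is not None and (
        hasattr(type(attr), '__set__') or hasattr(type(attr), '__delete__'))
    if is_data:                          # 1. data descriptors
        return getter(attr, obj, cls)
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:                # 2. instance variables
        return inst_dict[name]
    if getter is not None:               # 3. non-data descriptors
        return getter(attr, obj, cls)
    if attr is not _MISSING:             # 4. plain class attributes
        return attr
    fallback = getattr(cls, '__getattr__', None)
    if fallback is not None:             # 5. __getattr__, lowest priority
        return fallback(obj, name)
    raise AttributeError(name)


class Demo:
    x = property(lambda self: 'data descriptor')

    def m(self):
        return 'method'

    def n(self):
        return 'n'

    def __getattr__(self, name):
        return 'fallback'


demo = Demo()
demo.__dict__['x'] = 'instance'
demo.__dict__['m'] = 'instance'
```

With this setup, `generic_getattr(demo, name)` should agree with the built-in `getattr(demo, name)` for `x`, `m`, `n`, and any missing name.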

For classes, the machinery is in type.__getattribute__() which transforms
B.x into B.__dict__['x'].__get__(None,B). In pure Python, it looks
like:
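What follows is a rough sketch of that class-level lookup rather than the real implementation. It is written as a standalone function with the hypothetical name `class_getattr`, assumes Python 3, and ignores attributes defined on the metaclass:

```python
def class_getattr(cls, key):
    """Rough emulation of class-level lookup (metaclass attributes ignored)."""
    for base in cls.__mro__:
        if key in base.__dict__:
            v = base.__dict__[key]
            if hasattr(type(v), '__get__'):
                # Descriptors are invoked with None in place of an instance.
                return v.__get__(None, cls)
            return v
    raise AttributeError(key)


class B:
    x = 5

    @classmethod
    def name(cls):
        return cls.__name__

    @staticmethod
    def twice(n):
        return 2 * n
```

Here `class_getattr(B, 'x')` returns the plain value, while the class method and static method entries are passed through `__get__(None, B)` before being returned.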

The object returned by super() also has a custom __getattribute__()
method for invoking descriptors. The call super(B,obj).m() searches
obj.__class__.__mro__ for the base class A immediately following B
and then returns A.__dict__['m'].__get__(obj,A). If not a descriptor,
m is returned unchanged. If not in the dictionary, m reverts to a
search using object.__getattribute__().
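The lookup performed by super() can be reproduced by hand for a simple case. This sketch assumes Python 3 and a single-inheritance hierarchy in which the class right after B in the MRO defines m; the variable names are illustrative:

```python
class A:
    def m(self):
        return 'A.m'


class B(A):
    def m(self):
        return 'B.m'


obj = B()
via_super = super(B, obj).m()

# Manual equivalent: find the class following B in the MRO, then bind m.
mro = type(obj).__mro__
following = mro[mro.index(B) + 1]
manual = following.__dict__['m'].__get__(obj, following)()
```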

Note, in Python 2.2, super(B,obj).m() would only invoke __get__() if
m was a data descriptor. In Python 2.3, non-data descriptors also get
invoked unless an old-style class is involved. The implementation details are
in super_getattro() in
Objects/typeobject.c
and a pure Python equivalent can be found in Guido’s Tutorial.

The details above show that the mechanism for descriptors is embedded in the
__getattribute__() methods for object, type, and
super(). Classes inherit this machinery when they derive from
object or if they have a metaclass providing similar functionality.
Likewise, classes can turn off descriptor invocation by overriding
__getattribute__().
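As an illustration of turning the machinery off, the hypothetical class `Raw` below overrides `__getattribute__()` so that class attributes are returned without calling `__get__()`. It is a deliberately minimal sketch for Python 3 that ignores instance dictionaries:

```python
class Raw:
    """Hypothetical class that returns class attributes without binding."""

    def __getattribute__(self, name):
        for base in type(self).__mro__:
            if name in base.__dict__:
                return base.__dict__[name]   # descriptor is not invoked
        raise AttributeError(name)

    @property
    def p(self):
        return 42


r = Raw()
raw_attr = r.p   # the property object itself, not 42
```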

The following code creates a class whose objects are data descriptors which
print a message for each get or set. Overriding __getattribute__() is an
alternate approach that could do this for every attribute. However, this
descriptor is useful for monitoring just a few chosen attributes:
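A sketch of what such a class could look like, assuming Python 3. The names `RevealAccess` and `MyClass` and the message wording are illustrative; note that the value is stored on the descriptor itself, so it is shared by all instances of the owning class:

```python
class RevealAccess:
    """Data descriptor that stores a value and logs each get or set."""

    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype=None):
        print('Retrieving', self.name)
        return self.val

    def __set__(self, obj, val):
        print('Updating', self.name)
        self.val = val


class MyClass:
    x = RevealAccess(10, 'var "x"')   # monitored attribute
    y = 5                             # ordinary class attribute


m = MyClass()
first = m.x      # prints: Retrieving var "x"
m.x = 20         # prints: Updating var "x"
second = m.x     # prints: Retrieving var "x"
plain = m.y      # no message: y is not a descriptor
```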

The protocol is simple and offers exciting possibilities. Several use cases are
so common that they have been packaged into individual function calls.
Properties, bound and unbound methods, static methods, and class methods are all
based on the descriptor protocol.

The property() builtin helps whenever a user interface has granted
attribute access and then subsequent changes require the intervention of a
method.

For instance, a spreadsheet class may grant access to a cell value through
Cell('b10').value. Subsequent improvements to the program require the cell
to be recalculated on every access; however, the programmer does not want to
affect existing client code accessing the attribute directly. The solution is
to wrap access to the value attribute in a property data descriptor:
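One way this could look is sketched below for Python 3. The body of `recalc()` is a placeholder and the `recalc_count` counter is a hypothetical addition that only makes the recalculation visible:

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.recalc_count = 0    # hypothetical counter, for illustration
        self._value = None

    def recalc(self):
        # Placeholder for real spreadsheet recalculation logic.
        self.recalc_count += 1
        self._value = 42

    def getvalue(self):
        "Recalculate the cell before returning its value."
        self.recalc()
        return self._value

    # Existing client code keeps writing cell.value; no call syntax needed.
    value = property(getvalue)


cell = Cell('b10')
result = cell.value
```

Because only a getter is supplied, assigning to `cell.value` raises AttributeError, which is the data descriptor taking precedence over the instance dictionary.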

Python’s object oriented features are built upon a function based environment.
Using non-data descriptors, the two are merged seamlessly.

Class dictionaries store methods as functions. In a class definition, methods
are written using def and lambda, the usual tools for
creating functions. The only difference from regular functions is that the
first argument is reserved for the object instance. By Python convention, the
instance reference is called self but may be called this or any other
variable name.

To support method calls, functions include the __get__() method for
binding methods during attribute access. This means that all functions are
non-data descriptors which return bound or unbound methods depending on whether
they are invoked from an object or a class. In pure Python, it works like
this:
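The sketch below approximates that binding step with a hypothetical `Function` wrapper class. It assumes Python 3, where `types.MethodType` takes only the callable and the instance, and where access through the class returns the function itself rather than the unbound method object this text describes:

```python
import types


class Function:
    """Hypothetical stand-in for the binding behavior of function objects."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Python 3 assumption: class access yields the function itself.
            return self
        # Instance access yields a method bound to obj.
        return types.MethodType(self, obj)


class D:
    def __init__(self, x):
        self.x = x

    add = Function(lambda self, y: self.x + y)


d = D(1)
bound = d.add        # bound method: d is supplied as the first argument
unbound = D.add      # plain Function: caller supplies every argument
```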

The output suggests that bound and unbound methods are two different types.
While they could have been implemented that way, the actual C implementation of
PyMethod_Type in
Objects/classobject.c
is a single object with two different representations depending on whether the
im_self field is set or is NULL (the C equivalent of None).

Likewise, the effects of calling a method object depend on the im_self
field. If set (meaning bound), the original function (stored in the
im_func field) is called as expected with the first argument set to the
instance. If unbound, all of the arguments are passed unchanged to the original
function. The actual C implementation of instancemethod_call() is only
slightly more complex in that it includes some type checking.

Non-data descriptors provide a simple mechanism for variations on the usual
patterns of binding functions into methods.

To recap, functions have a __get__() method so that they can be converted
to a method when accessed as attributes. The non-data descriptor transforms an
obj.f(*args) call into f(obj, *args). Calling klass.f(*args)
becomes f(*args).

This chart summarizes the binding and its two most useful variants:

Transformation    Called from an Object    Called from a Class
--------------    ---------------------    -------------------
function          f(obj, *args)            f(*args)
staticmethod      f(*args)                 f(*args)
classmethod       f(type(obj), *args)      f(klass, *args)
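The three transformations can be checked with the built-in decorators. A short sketch for Python 3, using the hypothetical class `E`:

```python
class E:
    def f(self, x):
        return (self, x)

    @staticmethod
    def s(x):
        return x

    @classmethod
    def c(cls, x):
        return (cls.__name__, x)


e = E()
from_object = (e.f(1), e.s(1), e.c(1))   # f(e, 1), s(1), c(type(e), 1)
from_class = (E.f(e, 1), E.s(1), E.c(1))  # f(e, 1), s(1), c(E, 1)
```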

Static methods return the underlying function without changes. Calling either
c.f or C.f is the equivalent of a direct lookup into
object.__getattribute__(c,"f") or object.__getattribute__(C,"f"). As a
result, the function becomes identically accessible from either an object or a
class.

Good candidates for static methods are methods that do not reference the
self variable.

For instance, a statistics package may include a container class for
experimental data. The class provides normal methods for computing the average,
median, and other descriptive statistics that depend on the data. However,
there may be useful functions which are conceptually related but do not depend
on the data. For instance, erf(x) is a handy conversion routine that comes up
in statistical work but does not directly depend on a particular dataset.
It can be called either from an object or the class: s.erf(1.5) --> 0.9661 or
Sample.erf(1.5) --> 0.9661.

Since static methods return the underlying function with no changes, the example
calls are unexciting:
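A minimal sketch for Python 3, using a hypothetical pure Python `StaticMethod` class in place of the built-in so that the mechanism is visible:

```python
class StaticMethod:
    """Hypothetical pure Python emulation of a static method."""

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # Return the underlying function unchanged: nothing is bound.
        return self.f


class E:
    def f(x):
        return x * 2

    f = StaticMethod(f)


from_class = E.f(3)       # called from the class
from_object = E().f(3)    # called from an instance, same function
```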

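Class methods, by contrast, prepend the class reference to the argument list
before calling the function, whether the caller is an object or a class.
This behavior is useful whenever the function only needs to have a class
reference and does not care about any underlying data. One use for class methods
is to create alternate class constructors. In Python 2.3, the classmethod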
dict.fromkeys() creates a new dictionary from a list of keys. The pure
Python equivalent is:
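A sketch of how such a constructor could be written, assuming Python 3. `ClassMethod` is a hypothetical pure Python emulation of the built-in, and `Dict` is an illustrative subclass rather than the real dict implementation:

```python
class ClassMethod:
    """Hypothetical pure Python emulation of a class method."""

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)

        def newfunc(*args, **kwargs):
            # Prepend the class reference to the argument list.
            return self.f(klass, *args, **kwargs)

        return newfunc


class Dict(dict):
    def fromkeys(klass, iterable, value=None):
        "Build a new dictionary of type klass from an iterable of keys."
        d = klass()
        for key in iterable:
            d[key] = value
        return d

    fromkeys = ClassMethod(fromkeys)


letters = Dict.fromkeys('abc')
zeros = Dict().fromkeys([1, 2], 0)
```

Because the class is passed in rather than hard-coded, subclasses of `Dict` would get instances of their own type from the same constructor.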