An Interesting Fact About The Python Garbage Collector

Feb 16, 2016

While Python prides itself on being a simple, straightforward
programming language, and being explicit is considered a core value,
one can still discover interpreter specifics and implementation
details that one did not expect to find when working at the surface.
Recently I learned more about a peculiar property of the Python
garbage collector that I would like to share.

Let’s start by quickly introducing the problem. Python manages its
objects primarily by reference counting, i.e. each object stores how
many times it is referenced from other places, and this reference count
is updated over the runtime of the program. If the reference count drops
to zero, the object can no longer be reached by Python code, and its
memory can be freed or reused by the interpreter.

An optional method __del__ is called by the Python interpreter when
the object is about to be destroyed. This allows us to do some cleanup,
for example closing database connections. In practice, __del__ rarely
has to be defined. For our example we will use it to illustrate
when the disposal of an object happens:
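The original listing is not available here, so the following is only a
minimal sketch of the idea. The class A, its name attribute and the
disposed list are assumptions made for illustration. The immediate call
to __del__ relies on CPython's reference counting.

```python
disposed = []  # hypothetical helper: records which objects were finalized


class A(object):
    """Hypothetical example class that reports its own disposal."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Invoked by the interpreter when the object is about to be destroyed.
        disposed.append(self.name)
        print("disposing of %s" % self.name)


a = A("a")
alias = a       # a second reference to the same object
a = None        # one reference is left, so nothing happens yet
still_alive = (disposed == [])
alias = None    # the last reference is gone, CPython calls __del__ right away
```

The message is printed only after the last reference has been dropped,
not when a is first set to None.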

Reference counting alone cannot handle reference cycles. If, for
example, two instances of A refer to each other, then even after
setting a to None both objects still have refcounts of >= 1. For these
cases, Python employs a garbage collector, some code that traverses
memory and applies more complicated heuristics to discover unused
objects. We can use the gc module to manually trigger a garbage
collection run.
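As a rough sketch of such a manual run, the snippet below builds a
two-object cycle and collects it with gc.collect(). To keep the result
independent of the interpreter version, it uses a hypothetical class
Node without __del__. The names Node, other and probe are assumptions
made for this illustration only.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class without __del__, used only for this sketch."""

    def __init__(self):
        self.other = None


gc.disable()  # keep automatic collection out of the way for the demo
try:
    a = Node()
    b = Node()
    a.other = b  # a -> b
    b.other = a  # b -> a, which closes the cycle

    probe = weakref.ref(a)  # lets us observe a without keeping it alive

    a = None
    b = None
    # The cycle keeps both refcounts at 1, so the objects still exist.
    alive_before = probe() is not None

    gc.collect()  # manually trigger a full collection run
    alive_after = probe() is not None
finally:
    gc.enable()

print("alive before collect: %s" % alive_before)
print("alive after collect: %s" % alive_after)
```

The weak reference is used because any ordinary reference would itself
keep the object alive and hide the effect.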

However, since A implements __del__, Python refuses to clean up the
objects in such a cycle, arguing that it cannot tell which __del__
method to call first. Instead of doing the wrong thing (invoking them
in the wrong sequence), Python decides to do nothing at all. This
avoids undefined behaviour, but introduces a potential memory leak.

In fact, Python will not clean up any of the objects in the cycle, so
a much larger group of objects can end up polluting memory (see
https://docs.python.org/2/library/gc.html#gc.garbage ). We can inspect
the list of objects which could not be garbage collected:
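The following is a sketch rather than the original listing. The class A
with its name and other attributes and the finalized list are
assumptions. What it reports depends on the interpreter: before Python
3.4 the two objects are expected to show up in gc.garbage, while from
3.4 on the list stays empty and both __del__ methods are called.

```python
import gc
import sys

finalized = []  # hypothetical helper: names of objects whose __del__ ran


class A(object):
    """Hypothetical example class with a finalizer."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


a = A("a")
b = A("b")
a.other = b
b.other = a  # the two objects now form a reference cycle

a = None
b = None

gc.collect()

# Objects the collector found unreachable but could not free.
leaked = [obj.name for obj in gc.garbage if isinstance(obj, A)]

print("Python %d.%d" % sys.version_info[:2])
print("in gc.garbage: %s" % leaked)
print("finalized: %s" % sorted(finalized))
```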

Finally, if you remove the __del__ method from the class, you will
not find these objects in gc.garbage, as Python will simply dispose of
them.

Python 3

As it turns out, from Python 3.4 on, the issue I wrote about does not
exist anymore. __del__ methods no longer impede garbage collection, so
gc.garbage will only be filled for other reasons. For details, you can
read PEP 442 and the Python docs.

Considering the current adoption of Python 3.4, however, most Python
code bases still have to be careful about when to use __del__.