Last month we started to work on support for Python projects in gemnasium.com.
The goal is simple: to be able to track the dependencies of any Python project, as we already do for Ruby and Node.js projects.
This is just yet another packaging system, right? Well, not quite. There are some gotchas.
And there’s a story to tell.

Early days: Distutils

Distutils
is the first well-established distribution system for Python.
I don’t know how old it is,
but it seems that it has been around forever.

The whole purpose of Distutils is to distribute Python modules.
These can be “pure modules” written in Python or “extension modules” written in C/C++.

Among many things, Distutils is able to
create a source distribution
for a Python project.
The result is a self-contained archive (a zip file or a gzipped tarball), like a Ruby gem.
Here is how to build the distribution:

$ python setup.py sdist

Distutils requires that you create a setup.py script for your project.
According to the documentation, the bare minimum is a call to setup() that gives the name, the version and the modules of the project.

The documentation states that it’s possible to define
relationships between distributions and packages.
Dependencies on other Python packages are specified by
“supplying the requires keyword argument to setup()”.
Each dependency comes with a requirement.
So this should work:

Middle Ages: Setuptools

So Setuptools filled the gap and brought real dependency management to Python. Great!

Whenever possible, Setuptools tries to be a drop-in replacement for Distutils.
This means that Setuptools also relies on a setup.py script.
But it extends the script with new keyword arguments
to setup(), including special ones for dependency management:

install_requires

setup_requires

tests_require

extras_require

So we are now able to declare the packages that we need to install, setup, test, etc.
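
To make this more concrete, here is a rough sketch of how the four keywords might appear in a setup.py. The project name, the package names and the versions are all made up for the example; only the keyword names come from Setuptools. The call to setup() is wrapped in a function so that the data can be read without running any command.

```python
# Sketch of the dependency-related arguments of a setup.py file.
# All package names and versions below are hypothetical.
SETUP_KWARGS = {
    "name": "myapp",
    "version": "0.1",
    # Needed at runtime, installed along with the package.
    "install_requires": ["Framework==0.9.4", "Library>=0.2"],
    # Needed to run setup.py itself.
    "setup_requires": ["BuildHelper>=1.0"],
    # Needed only to run the test suite.
    "tests_require": ["TestRunner"],
    # Optional features, keyed by the name of the extra.
    "extras_require": {"docs": ["DocTool>=1.0"]},
}


def main():
    # Imported lazily so that the data above can be inspected
    # even when Setuptools is not installed.
    from setuptools import setup

    setup(**SETUP_KWARGS)
```

In a real setup.py, main() would simply be called at the bottom of the script.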

The syntax is close to the one used by the requires keyword.
But requires (from Distutils) is not used anymore.
By the way, Setuptools comes with some documentation about
versioning
and
declaring the dependencies.

Let’s consider the shootout sample application
from the Pylons project.
setup.py takes care of building the egg:

By the way, this is how EasyInstall explores the dependencies of a package.

What about Distribute?

By the way, you may have heard about Distribute, a fork of the Setuptools project.
But don’t worry about it since Setuptools and Distribute have merged.
This means that the latest version of Distribute is now a wrapper around Setuptools.
Have a look at the Merge FAQ
if you are wondering about the consequences.

Modern ages: pip

And then came pip.
It extends EasyInstall
in many ways.
Here are some features:

all packages are downloaded before installation starts, which avoids partially completed installs

support for various version control systems, like Git

simple to define fixed sets of requirements and reliably reproduce a set of packages

At first sight pip install looks like easy_install
with more options.
But pip goes a step further with this killer feature:

$ pip install -r requirements.txt

It installs everything according to the requirements described in a text file.
The requirements file
looks like the requires.txt metadata file from the Python Egg format:

MyApp
Framework==0.9.4
Library>=0.2

But the requirements are now stored in a separate text file one can easily review, edit and store.
It’s not generated by some Python script anymore.
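
Since the format is that simple, a few lines of Python are enough to read the sample above. This is only a sketch: the parse_requirements function and its regular expression are our own assumptions, they handle a single constraint per line, and they are not how pip itself parses requirement files.

```python
import re

# Simplified pattern: a name, optionally followed by one operator
# and one version. Real requirement files allow much more
# (several constraints, URLs, command-line options).
REQ_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"(?:\s*(?P<op>==|>=|<=|!=|>|<)\s*(?P<version>[A-Za-z0-9.]+))?$"
)


def parse_requirements(text):
    """Return a list of (name, operator, version) tuples."""
    requirements = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace, skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = REQ_RE.match(line)
        if match is None:
            raise ValueError("unsupported requirement: %r" % line)
        requirements.append(
            (match.group("name"), match.group("op"), match.group("version"))
        )
    return requirements


SAMPLE = """\
MyApp
Framework==0.9.4
Library>=0.2
"""

for requirement in parse_requirements(SAMPLE):
    print(requirement)
```

An unconstrained requirement such as MyApp comes back with no operator and no version, which is exactly what makes it hard to reproduce an environment from it.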

OK, but how do we get this requirements.txt file?
The workflow is simple:

install the dependencies for your project

freeze the requirements to a file using pip freeze

But this means we need some tool like virtualenv
to create and isolate Python environments.
We don’t want to mess with the dependencies, do we?
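
To get a feeling for what the freeze step produces, here is a minimal sketch that lists the distributions visible to the current interpreter, using importlib.metadata from the standard library (Python 3.8 or later). The freeze function is a hypothetical stand-in; the real pip freeze handles more cases, such as editable installs.

```python
from importlib import metadata


def freeze():
    """Return sorted 'name==version' lines for installed distributions.

    This only approximates what pip freeze prints: it ignores
    editable installs, VCS checkouts and pip's own filtering rules.
    """
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add("%s==%s" % (name, version))
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze()))
```

Run inside a fresh virtualenv, the output contains only what you have installed for the project, which is the whole point of isolating the environment.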

By the way, the approach is somewhat similar to what bundler
does in Ruby land:

bundler creates an isolated environment for your project

it knows the exact versions of the packages you have installed

so it is able to reinstall the exact same version of each package

If you are curious about Ruby, I suggest you have a look at
Yehuda Katz’s blog post,
which covers the best practices for managing dependencies with
Bundler and its Gemfile.

The PaaS Era

pip has been around for a while, but it has become indispensable:
if you deploy to a PaaS,
chances are that you need a pip requirements file
to control the environment your web application runs in.

For instance, Heroku has Python support
and it leverages pip
and its requirements.txt to provision the servers
with the dependencies of the web application you deploy.

Wrapping up

So pip is the new thing, and we don’t have to declare the dependencies in setup.py anymore, right?
Well, it depends on what you are looking for.

The pip workflow is best if your project is some kind of application, like a Django application.
Then it makes sense to freeze your dependencies at the exact versions that are known to be safe.
In fact, having a requirements.txt file is the only way to deploy to PaaS providers like Heroku.
And you probably don’t care about building Python eggs.

But your project may be a reusable library, something other projects may depend on.
In this case, you probably want to distribute your package as a Python Egg via PyPI.
The dependencies of your package should be installed automatically.
The only way to get this is to declare your dependencies using the install_requires keyword in setup.py.

Some projects, like CMS engines, are both reusable packages and standalone web applications.
In this case, it makes sense to declare the requirements both in setup.py and requirements.txt.
Requirements are generally broader in setup.py in order to avoid conflicts.
And it’s a common practice to have “locked versions” in requirements.txt.
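
One way to keep the two files consistent is to check that each locked version falls within the range declared in setup.py. The sketch below does this for purely numeric versions; the package names, ranges and pins are hypothetical, and the comparison is far simpler than the version ordering Setuptools implements.

```python
import operator

OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def version_key(version):
    """Turn '0.9.4' into (0, 9, 4). Purely numeric versions only.

    Trailing zeros are not normalized, so '1.0' and '1.0.0'
    compare as different versions here.
    """
    return tuple(int(part) for part in version.split("."))


def satisfies(pinned, constraints):
    """Check a pinned version against (operator, version) pairs."""
    return all(
        OPERATORS[op](version_key(pinned), version_key(bound))
        for op, bound in constraints
    )


# Hypothetical broad ranges, as they might be declared in setup.py ...
RANGES = {
    "Framework": [(">=", "0.9"), ("<", "1.0")],
    "Library": [(">=", "0.2")],
}
# ... and hypothetical locked versions from requirements.txt.
LOCKED = {"Framework": "0.9.4", "Library": "0.1"}

for name, pinned in sorted(LOCKED.items()):
    status = "ok" if satisfies(pinned, RANGES[name]) else "conflict"
    print("%s==%s: %s" % (name, pinned, status))
```

Here the Framework pin sits inside its range, while the Library pin is older than what setup.py allows and gets reported as a conflict.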