Deploy python working environments with pip under restricted connections
∞

Background

One can often use pre-built python distributions to setup python working
environments, for example Anaconda,
Enthought Canopy. But these distributions has the
following common problems, which reduces their usefulness:

It’s hard to upgrade packages without internet connections. It’s also hard
to install packages not contained in the distribution.

It does not play well with existing packages on the target systems. For
example, Anaconda bundles mpich, which conflict with system MPIs.

These distributions has poor performances. Since they are built on
relatively old platforms and possibly with no optimizations, they are
often slower than native python. For example, I found anaconda python on a
RHEL 6.3 system very slow to start.

For me, it’s essential to deploy my python working environments with the
following considerations:

Easy to install/upgrade packages on demand, under restricted connections.
Some systems I work with does not have internet connections.

It shall integrate well with existing python packages. Some packages such
as PyQt5 requires deep system integration so I would better install them
with system package manager.

Good performance. The distributed packages shall run fast and be
responsive on the target platform.

The key ideas

The first is pip user install. With pip, we have the most complete
python package collection. Almost all packages are available in PyPI, and we
can always make home-brew python packages pip-compatible. pip is a standard
python module since python 2.7.6 and is easy to install for prior python
versions. So pip serves as a built-in package manager. pip supports
installing packages in the users directory and plays well with existing
packages. For example, pip install will skip a package if already available
and capable. pip user install adds paths to standard python module paths,
instead of overwriting them.

The second is python requirement file. pip supports install packages and
its dependencies from a requirement file. That is, you list packages you
want, pip automatically solves the dependency, grabs proper packages and
install them. It makes both install and upgrade a piece of cake.

The last is python wheels. Python wheels is simply pre-built python
packages. The real power with python wheels is that you can build your own
wheel for packages in a requirement list, together with all dependencies.
After that, you can install directly from wheels without the need of a network
connection.

In practice

With the above three key ideas, I develop the following method to solve the
python working environment distribution problem:

Set up a base CPython version. One can choose to rely on system python, or
compile their own CPython.

Maintain a requirement file. Only list packages you need in the
requirements. You can also specify version constraints.