UPDATE: This post is outdated. It was written at a time when CFFI required a
number of hacks in order to sanely package it for distribution. Since this post
was written, CFFI has released version 1.0, which includes a new API that
makes these hacks unnecessary. You can read my new blog post,
Distributing a CFFI Project Redux, or the CFFI documentation to see how
to distribute projects in a post-CFFI 1.0 world.

CFFI is a C Foreign Function Interface for Python. It sits somewhere
between writing a full blown C extension and using the ctypes interface. It is
a great way to call into C code from within Python with a few important
advantages over C extensions, ctypes, and SWIG:

- It simply calls C code from Python code. It does not require learning a
  DSL (unlike Cython and SWIG) and its API is very minimal (unlike ctypes).

- It works sanely and with good performance in both PyPy and CPython, and has
  a reasonable path for alternative implementations to support it as well.

I’ve used CFFI for a while now, and I can easily say that I fully recommend it
for anyone needing to call into C from Python. However, CFFI does have one
particularly gnarly problem: packaging.

Correctly and sanely distributing an application written using CFFI is an
exercise in frustration requiring a thorough understanding of the packaging
toolchain, CFFI, and Python itself. On top of that CFFI has a sort of
misfeature where it will implicitly compile the generated C extension if it
cannot load one. This is incredibly handy during iterative development but can
wreak havoc on your ability to test the installation of your project as if it
were being deployed.

Minimal Example

Here is a minimal example of using CFFI to be able to call the printf
function from Python:

This example works: if you save it as example.py in your current directory and
execute it with PYTHONPATH=. python -m example, you’ll get output that looks
like:

$ PYTHONPATH=. python -m example
Hi There!

This works because when you call the ffi.verify function CFFI will attempt
to load an already compiled module for this FFI instance, and failing to
find it will implicitly compile a new one and then load it. This particular
feature can be a great boon while iteratively developing a project because you
never have to explicitly compile anything. In effect it makes working on a C
binding as simple and quick as working on a pure Python project.

Packaging our Example Project

Now that we have a simple example.py file we can package this up so that we
can distribute it to other people. We’ll use a simple setup.py taken from
the CFFI docs with some slight modifications to fit our project:

# The CFFI docs suggest that you can also use distutils, while technically
# correct you should use setuptools because otherwise you cannot specify
# a dependency on CFFI.
from setuptools import setup

# you must import at least the module(s) that define the ffi's
# that you use in your application
import example

setup(
    name="example",
    version="0.1",
    py_modules=["example"],
    ext_modules=[
        example.ffi.verifier.get_extension(),
    ],
    install_requires=[
        "cffi",
    ],
    zip_safe=False,
)

Now that we have our setup.py we can go ahead and create an sdist using the
command python setup.py sdist which will give us example-0.1.tar.gz in
the dist/ folder. We can even publish it to PyPI and then let other users
install it using pip install example!

Except they won’t be able to install it because what we actually would have
published is a broken package that relies on:

1. The Python development headers to be installed (if installing into CPython).

2. The libffi development headers to be installed (if installing into CPython).

3. CFFI (and its dependencies) to be installed.

There isn’t much that can be done about #1 or #2; they will just need to be
documented as requirements. For #3, however, we can use a setuptools feature
called setup_requires to ensure that CFFI is installed when the setup.py is
executed. Using this feature for CFFI is a little bit ugly: the items inside
of setup_requires get installed as the first part of executing the setup()
function, but by that point it’s already too late, because we need CFFI in
order to build the ext_modules that we pass into the setup() call. Luckily
distutils/setuptools does provide the right kind of hooks to make this
possible.
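One way to use those hooks, given here as a sketch under stated assumptions
rather than as the definitive recipe, is to override finalize_options() on the
build and install commands so that ext_modules is only computed after
setuptools has processed setup_requires. The names get_ext_modules,
DeferredExtModules and load_ext_modules are made up for illustration.

```python
def get_ext_modules():
    # Deliberately imported inside the function: importing example pulls in
    # cffi, which is only guaranteed to be present once setup_requires has
    # been processed.
    import example
    return [example.ffi.verifier.get_extension()]


class DeferredExtModules(object):
    """Mixin for a distutils/setuptools command class (hypothetical name)."""

    def load_ext_modules(self):
        return get_ext_modules()

    def finalize_options(self):
        # Fill in ext_modules late, then let the real command carry on.
        self.distribution.ext_modules = self.load_ext_modules()
        super(DeferredExtModules, self).finalize_options()


# In setup.py the mixin would be combined with the real commands, roughly:
#
#     from distutils.command.build import build
#     from setuptools.command.install import install
#
#     class CFFIBuild(DeferredExtModules, build):
#         pass
#
#     class CFFIInstall(DeferredExtModules, install):
#         pass
#
#     setup(
#         ...,
#         setup_requires=["cffi"],
#         cmdclass={"build": CFFIBuild, "install": CFFIInstall},
#     )
```

With this arrangement the top-level import example disappears from setup.py,
so nothing touches CFFI until a command that actually builds something is
being set up.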

Now if we recreate our sdist and install it, instead of an error that says
something like ImportError: No module named 'cffi' we’ll get a successful
installation, and we can verify that this is the case by executing our module:

$ python -m example
Hi There!

We now have an sdist that can be sent to PyPI and that others can install.
However, there are still a number of issues with our package, and they will
crop up in strange cases with hard-to-debug errors. The problems that we
still have are:

1. The artifacts produced by default by CFFI have a hard dependency on a
   particular CFFI version, making it impossible to upgrade CFFI without
   rebuilding any package that uses it.

2. Installing the project does a double compile, one of which will cause
   problems for anyone trying to cross compile the software.

3. The implicit compile, which can be very helpful in development, will often
   mask problems like #2 on a local machine: if you upgrade your version of
   CFFI, the next time you import the module it will simply implicitly
   recompile the C extension. This, however, will break in common deployment
   scenarios where the executing user does not have write permissions to the
   site-packages folder, or where they installed a binary package and do not
   have a compiler or development headers installed on the machine.

The problem in #1 is that behind the scenes CFFI generates a module name that
it will compile and load. This module name contains a hash of a few things like
the Python version (major and minor), the CFFI version, the string passed into
the FFI instance, and most of the keyword arguments to the
FFI().verify() function. The idea behind this is that if any of these
things changed then the ABI might have changed so it’s a good idea to rebuild
the extension module. The inclusion of the CFFI version causes #1, so to fix it
we’ll compute our own hash and tell CFFI to use it instead.

First we’ll create a function which computes our module name and then we’ll
pass that into the FFI().verify() call so that CFFI will use our computed
module name instead.
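As a rough sketch of that idea, and not the exact function from any project,
the hash below covers the Python major and minor version, the cdef string and
the C source, and leaves the CFFI version out. The helper name
create_modulename, the "_example_cffi_" prefix and the choice of SHA-1 are
assumptions made for illustration; any keyword arguments that affect the build
would need to be folded into the key as well.

```python
import hashlib
import sys


def create_modulename(cdef_source, source, version_info=sys.version_info):
    """Build a module name from everything except the CFFI version."""
    python_version = "{0}.{1}".format(version_info[0], version_info[1])
    key = "\x00".join([python_version, cdef_source, source])
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # The prefix is arbitrary, but the result must be a valid Python module
    # name, so it cannot start with a digit.
    return "_example_cffi_{0}".format(digest[:16])


CDEF = "int printf(const char *format, ...);"
SOURCE = "#include <stdio.h>"

modulename = create_modulename(CDEF, SOURCE)

# In example.py the name would then be handed to CFFI, for instance:
#     lib = ffi.verify(SOURCE, modulename=modulename)
```

Because the name depends only on inputs we control, the same compiled module
keeps being found after a CFFI upgrade.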

Now we can upgrade our CFFI version without needing to recompile all of our
CFFI-using projects. Installing this example project still requires building
the C extension twice, however, and the implicit compile is still there
lurking in the shadows, waiting to mask hidden errors.

The first of our two compiles is the implicit compile, which happens when
setup.py imports the example module and the FFI().verify() function is
called. The second compile comes from distutils itself compiling the module
for install. We want only distutils to compile our module, because there is a
lot of tooling out there that has learned how to work with distutils, and
because doing so avoids issues like leftover files and various cross-compiling
woes.

In order to stop CFFI from implicitly compiling on module import we need to
stop calling the FFI().verify() function. However, we still need the
FFI().verifier object to get the Extension object that we pass into
ext_modules, and the FFI().verifier object is created and set up by the
FFI().verify() function. So instead of calling FFI().verify() we’ll construct
our own Verifier() instance and assign it to FFI().verifier. We’ll also need
to call FFI().verifier.load_library(), but we MUST ensure that this does not
happen when importing the module; it MUST be deferred to a later time. To do
that we’ll use a small shim class which acts as a stand-in for the loaded
library and defers loading the library until the first attempt to call a C
function.

The LazyLibrary class will defer the actual loading of the library until
the first time an attribute is accessed on it, and will otherwise just act
as a proxy to the underlying C library. It is important to make sure that you
do not access any attributes on the LazyLibrary() object in a way that
will execute during the import of the module.
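A minimal sketch of such a shim, assuming only that the object passed in has a
verifier attribute with a load_library() method, could look like this. The
lock is there so that two threads making their first C call at the same time
do not both load the library.

```python
import threading


class LazyLibrary(object):
    def __init__(self, ffi):
        self._ffi = ffi
        self._lib = None
        self._lock = threading.Lock()

    def __getattr__(self, name):
        # __getattr__ only runs for names that normal lookup did not find,
        # which in practice means the C functions and constants.
        if self._lib is None:
            with self._lock:
                if self._lib is None:
                    self._lib = self._ffi.verifier.load_library()
        return getattr(self._lib, name)


# In example.py this would replace the ffi.verify() call, roughly:
#
#     from cffi.verifier import Verifier
#
#     ffi.verifier = Verifier(ffi, SOURCE, modulename=modulename)
#     lib = LazyLibrary(ffi)
```

Nothing is loaded when the module is imported; the first lib.printf(...) call
is what triggers load_library().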

Finally, we still have the ability to implicitly compile our module. If all
goes well this will never happen during the normal installation and use of our
module; however, it is deceptively easy to accidentally do something which
will trigger an implicit compile and bring back the kinds of problems that
LazyLibrary works around. Disabling the implicit compile is pretty easy, but
it requires patching the Verifier() instance to replace the function that CFFI
uses to compile modules with one that simply raises an error.
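The patch can be sketched as below. It assumes the pre-1.0 Verifier exposes
its compile step as compile_module() and the private _compile_module(), and it
shadows both on the instance; the helper names are hypothetical.

```python
def _refuse_to_compile(*args, **kwargs):
    raise RuntimeError("Attempted implicit compile of a CFFI module.")


def disable_implicit_compile(verifier):
    # Assumed attribute names (see above). Assigning on the instance hides
    # the methods defined on the class, so any code path that would have
    # compiled the module now fails loudly instead.
    verifier.compile_module = _refuse_to_compile
    verifier._compile_module = _refuse_to_compile
    return verifier


# In example.py, right after constructing the Verifier:
#     disable_implicit_compile(ffi.verifier)
```

The distutils build is unaffected, since it compiles the Extension object
itself rather than going through these methods.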

Now we finally have a simple project that calls into C using CFFI and which
can sanely be distributed to others and deployed onto production systems. This
will also work with all the common binary packages like Wheels.

Bonus: “Better” setup_requires

One issue with the setup.py that I’ve written above is that it is going to
install CFFI and all of its dependencies for any invocation of setup.py,
even just for printing out the usage information with
python setup.py --help. This is because setuptools doesn’t really have the
concept of a “build” dependency, which is what we really want here; it only
has the concept of a dependency required to execute the setup.py. Thus
setuptools installs the items listed in setup_requires for any invocation,
because it doesn’t know why an item is in there, just that it is required at
some point in its execution.

We can limit this so that setuptools will only install CFFI if required,
however it requires adding more logic to our setup.py. This isn’t strictly
required, though users may appreciate being able to query information from the
setup.py without downloading and installing CFFI.

To do this we’ll create a function that will inspect the arguments that
setup.py was called with and determine if any of them are invoking
something which will require CFFI in setup_requires. This function can then
add additional keyword arguments to the setup() function call depending on
if we need CFFI in the setup_requires or not.
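A simplified sketch of such a function follows. The list of arguments treated
as informational is an assumption and deliberately incomplete, and the names
NO_CFFI_ARGUMENTS, needs_cffi and keywords_with_side_effects are chosen for
illustration only.

```python
# Hypothetical, incomplete list of arguments that only query information
# and therefore never need to build the extension.
NO_CFFI_ARGUMENTS = frozenset([
    "-h", "--help", "--help-commands", "--version",
    "--name", "--fullname", "--author", "--license", "--description",
    "egg_info", "clean",
])


def needs_cffi(argv):
    """Return True unless the invocation is clearly informational."""
    arguments = argv[1:]
    if not arguments:
        return False  # a bare "setup.py" does no work
    return not any(arg in NO_CFFI_ARGUMENTS for arg in arguments)


def keywords_with_side_effects(argv):
    """Extra setup() keyword arguments for this particular invocation."""
    if needs_cffi(argv):
        return {"setup_requires": ["cffi"]}
    return {}


# In setup.py:
#     setup(name="example", ..., **keywords_with_side_effects(sys.argv))
```

Erring on the side of returning setup_requires for anything unrecognised keeps
real builds working, at the cost of an occasional unnecessary download.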

Conclusion and the Future

CFFI is a great tool for calling into C from within Python. While it does have
a number of problems when it comes to packaging up software that uses it, none
of those issues are deal breakers and all of them can be worked around in some
fashion. All of the techniques shown here were taken from the cryptography
project, which can be used as a reference for any changes to these techniques
as well as an example of them being used in a real-life project.

Looking towards the future, I plan to upstream these ideas, and I will blog
again once these problems have been resolved inside of CFFI itself.