Creating Python snaps

Background

Snappy Ubuntu Core is a new edition of the Ubuntu you know and love, with
some interesting new features, including atomic, transactional updates, and a
much more lightweight application deployment story than traditional
Debian/Ubuntu packaging. Much of this work grew out of our development of a
mobile/touch based version of Ubuntu for phones and tablets, but now Ubuntu
Core is available for clouds and devices.

I find the transactional nature of upgrades to be very interesting. While you
still get a perfectly normal Ubuntu system, your root file system is
read-only, so traditional apt-get based upgrades don't work. Instead, your
system version is image based; today you are running image 231 and tomorrow
a new image is released to get you to 232. When you upgrade to the new image,
you get all the system changes. We support both full and delta upgrades
(the latter of which reduces bandwidth), and even phased updates so that we can
roll out new upgrades and quickly pull them from the server side if we notice
a problem. Snappy devices even support rolling back upgrades on a single
device, by using a dual-partition root file system. Phones generally don't
support this due to lack of available space on the device.

Of course, the other really interesting thing about Snappy is the
lightweight, flexible approach to deploying applications. I still remember my
early days learning how to package software for Debian and Ubuntu, and now
that I'm both an Ubuntu Core Developer and Debian Developer, I understand
pretty well how to properly package things. There's still plenty of black art
involved, even for relatively easy upstream packages such as
distutils/setuptools-based Python packages available on the Cheeseshop (er,
PyPI). The Snappy approach on Ubuntu Core is much more lightweight and easy,
and it doesn't require the magical approval of the archive elves, or the
vagaries of PPAs, to make your applications quickly available to all your
users. There's even a robust online store for publishing your apps.

There's lots more about Snappy apps and Ubuntu Core that I won't cover here,
so I encourage you to follow the links for more information. You might also
want to stop now and take the tour of Ubuntu Core (hey, I'm a poet and I
didn't even realize it).

In this post, I want to talk about building and deploying snappy Python
applications. Python itself is not an officially supported development
framework, but we have a secret weapon. The system image client upgrader --
i.e. the component on the devices that checks for, verifies, downloads, and
applies atomic updates -- is written in Python 3. So the core system provides
us with a full-featured Python 3 environment we can utilize.

The question that came to mind is this: given a command-line application
available on PyPI, how easy is it to turn into a snap and install it on an
Ubuntu Core system? With some caveats I'll explore later, it's actually
pretty easy!

Basic approach

The basic idea is this: let's take a package on PyPI, which may have
additional dependencies also on PyPI, download them locally, and build them
into a snap that we can install on an Ubuntu Core system.

The first question is, how do we build a local version of a fully-contained
Python application? My initial thought was to build a virtual environment
using virtualenv or pyvenv, and then somehow turn that virtual environment
into a snap. This turns out to be difficult in practice because virtual
environments aren't really designed for this. They have issues with being
relocated, for example, and they can contain a lot of extraneous stuff that's
great for development (a virtual environment's actual purpose) but unnecessary
baggage for our use case.

My second thought involved turning a Python application into a single file
executable, and from there it would be fairly easy to snappify. Python has a
long tradition of such tools, many with varying degrees of cross platform
portability and standalone-ishness. After looking again at some oldies but
goodies (e.g. cx_freeze) and some new offerings, I decided to start with
pex.

pex is a nice tool developed by Brian Wickman and the Twitter folks which they
use to deploy Python applications to their production environment. pex takes
advantage of modern Python's support for zip imports, and a clever trick of
zip files.

Python supports direct imports (of pure Python modules) from zip files, and
the python executable's -m option works even when the module is inside
a zip file. Further, the presence of a __main__.py file within a package
can be used as shorthand for executing the package, e.g. python -m myapp
will run myapp/__main__.py if it exists.
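Here's a minimal sketch of the zip import behavior, using only the standard library. The module name zipdemo_greeting and the archive name demo.zip are made up for illustration.

```python
import os
import sys
import tempfile
import zipfile


def import_from_zip():
    """Build a tiny zip holding one pure-Python module and import from it."""
    with tempfile.TemporaryDirectory() as tmpdir:
        archive = os.path.join(tmpdir, "demo.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            # A hypothetical module; any pure-Python source would do.
            zf.writestr(
                "zipdemo_greeting.py",
                "def greet():\n    return 'hello from the zip'\n",
            )
        # Putting the archive itself on sys.path enables zip imports.
        sys.path.insert(0, archive)
        try:
            import zipdemo_greeting
            return zipdemo_greeting.greet(), zipdemo_greeting.__file__
        finally:
            sys.path.remove(archive)


message, module_file = import_from_zip()
```

The module's __file__ points inside the archive, which is a handy way to confirm where an import really came from.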

Zip files are interesting because their index is at the end of the file. This
allows you to put whatever you want at the front of the file and it will still
be considered a zip file. pex exploits this by putting a shebang in the first
line of the file, e.g. #!/usr/bin/python3, and thus the entire zip file
becomes a single file executable of Python code.
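The trick is easy to check for yourself. This is only a sketch of the idea, not pex's own code: it builds a small archive in memory, prepends a shebang line, and confirms that the zipfile module still reads it.

```python
import io
import zipfile


def make_executable_zip(shebang=b"#!/usr/bin/python3\n"):
    """Return the bytes of a small zip archive with a shebang line prepended."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("__main__.py", "print('hello')\n")
    return shebang + buffer.getvalue()


data = make_executable_zip()
# The archive index lives at the end of the file, so the leading
# shebang line does not stop the data from being read as a zip.
still_a_zip = zipfile.is_zipfile(io.BytesIO(data))
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()
```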

There are, of course, plenty of caveats. Probably the main one is that Python
cannot import extension modules directly from the zip, because the
dlopen() function call only takes a file system path. pex handles this by
marking the resulting file as not zip safe, so the zip is written out to a
temporary directory first.
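To make the zip-safety question concrete, here's a rough sketch of how you might check whether an archive contains compiled extension modules. This is my own illustration and an assumption about what such a check could look like; it is not how pex makes its decision. The helper name contains_extension_modules is hypothetical.

```python
import importlib.machinery
import io
import zipfile


def contains_extension_modules(archive):
    """Return True if any member of the zip looks like a compiled extension.

    `archive` may be a path or a file-like object. The check relies on the
    file suffixes the running interpreter uses for extension modules.
    """
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    with zipfile.ZipFile(archive) as zf:
        return any(name.endswith(suffixes) for name in zf.namelist())


# A pure-Python archive needs no extraction before import.
pure = io.BytesIO()
with zipfile.ZipFile(pure, "w") as zf:
    zf.writestr("pkg/mod.py", "")
pure_has_extensions = contains_extension_modules(pure)
```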

The other issue, of course, is that the zip file must contain all the
dependencies not present in the base Python. pex is actually fairly smart
here, in that it will chase dependencies, much like pip, and it will include
those dependencies in the zip file. You can also specify any missed
dependencies explicitly on the pex command line.

Once we have the pex file, we need to add the required snappy metadata and
configuration files, and run the snappy command to generate the .snap
file, which can then be installed into Ubuntu Core. Since we can extract
almost all of the minimal required snappy metadata from the Python package
metadata, we need only a little input from the user, and the rest of the work
can be automated.

We're also going to avail ourselves of a convenient cheat. Because Python 3
and its standard library are already part of Ubuntu Core on a snappy device,
we don't need to worry about any of those dependencies. We're only going to
support Python 3, so we get its full stdlib for free. If we needed access to
Python 2, or any external libraries or add-ons that can't be made part of the
zip file, we would need to create a snappy framework for that, and then
utilize that framework for our snappy app. That's outside the scope of this
article though.

Requirements

To build Python snaps, you'll need to have a few things installed. If you're
using Ubuntu 15.04, just apt-get install the appropriate packages.
Otherwise, you can get any additional Python requirements by building a
virtual environment and installing tools like pex and wheel into it, then
invoking pex from that virtual environment. But let's assume you have the
Vivid Vervet (Ubuntu 15.04); here are the packages you need:

python3
python-pex-cli
python3-wheel
snappy-tools
git

You'll also want a local git clone of https://gitlab.com/warsaw/pysnap.git
which provides a convenient script called snap.py for automating the
building of Python snaps. We'll refer to this script extensively in the
discussion below.

For extra credit, you might want to get a copy of Python 3.5 (unreleased as
of this writing). I'll show you how to do some interesting debugging with
Python 3.5 later on.

From PyPI to snap in one easy step

Let's start with a simple example: world is a very simple script that can
provide forward and reverse mappings of ISO 3166 two letter country codes
(at least as of before ISO once again paywalled the database). So if you get
an email from guido@example.py you can find out where the BDFL has his secret
lair:

$ world py
py originates from PARAGUAY

world is a pure-Python package with both a library and a command line
interface. To get started with the snap.py script mentioned above, you
need to create a minimal .ini file, such as:

[project]
name: world

[pex]
verbose: true

Let's call this file world.ini. (In fact, you'll find this very file
under the examples directory in the snap git repository.) What do the various
sections and variables control?

name is the name of the project on PyPI. It's used to look up metadata
about the project on PyPI via PyPI's JSON API.

The verbose variable just determines whether to pass -v to the underlying
pex command.
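If you're curious how a file like this can be read, here's a minimal sketch using the standard library's configparser, which accepts the "name: value" syntax. It is an illustration only; snap.py may well do it differently, and the function name read_snap_config is hypothetical.

```python
import configparser

SAMPLE = """\
[project]
name: world

[pex]
verbose: true
"""


def read_snap_config(text):
    """Return (project name, verbose flag) from .ini text like world.ini."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    name = parser.get("project", "name")
    # Treat a missing [pex] section or verbose option as "not verbose".
    verbose = parser.getboolean("pex", "verbose", fallback=False)
    return name, verbose


name, verbose = read_snap_config(SAMPLE)
```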

Now, to create the snap, just run:

$ ./snap.py examples/world.ini

You'll see a few progress messages and a warning which you can ignore. Then
out spits a file called world_3.1.1_all.snap. Because this is pure
Python, it's architecture independent. That's a good thing because the snap
will run on any device, such as a local amd64 kvm instance, or an ARM-based
Ubuntu Core-compatible Lava Lamp.

Armed with this new snap, we can just install it on our device (in this case,
a local kvm instance) and then run it:

From git repository to snap in one easy step

Let's look at another example, this time using a stupid project that contains
an extension module. This aptly named package just prints a "yes" for every
-y argument, and "no" for every -n argument.

The difference here is that stupid isn't on PyPI; it's only available via
git. The snap.py helper is smart enough to know how to build snaps from
git repositories. Here's what the stupid.ini file looks like:

Notice that there's a [project]origin variable. This just says that the
origin of the package isn't PyPI, but instead a git repository, and then the
public repo url is given. The first word is just an arbitrary protocol tag;
we could eventually extend this to handle other version control systems or
origin types. For now, only git is supported.
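As a sketch of how an origin value could be interpreted, assuming it is a protocol tag followed by a space and a URL, something like this would do. The function name parse_origin and the example.com URL are hypothetical, and the real snap.py may parse the value differently.

```python
def parse_origin(value):
    """Split an origin setting into (protocol, url).

    Assumes the value looks like "git <url>", i.e. a protocol tag,
    whitespace, then the repository URL.
    """
    protocol, _, url = value.strip().partition(" ")
    url = url.strip()
    if protocol != "git":
        raise ValueError("unsupported origin protocol: {!r}".format(protocol))
    if not url:
        raise ValueError("origin is missing a repository URL")
    return protocol, url


# A made-up repository URL, purely for illustration.
protocol, url = parse_origin("git https://example.com/demo.git")
```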

To build this snap:

$ ./snap.py examples/stupid.ini

This clones the repository into a temporary directory, builds the Python
package into a wheel, and stores that wheel in a local directory. pex has the
ability to build its pex file from local wheels without hitting PyPI, which we
use here. Out spits a file called stupid_1.1a1_all.snap, which we can
install in the kvm instance using the snappy-remote command as above, and
then run it after ssh'ing in:

ubuntu@localhost:~$ stupid.stupid -ynnyn
yes
no
no
yes
no

Watch out though, because this snap is really not architecture-independent.
It contains an extension module which is compiled on the host platform, so it
is not portable to different architectures. It works on my local kvm
instance, but sadly not on my Lava Lamp.

Entry points

pex currently requires you to explicitly name the entry point of your Python
application. This is the function which serves as your main and it's what
runs by default when the pex zip file is executed.
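An entry point is written as "package.module:function". Here's a small sketch of how such a string can be resolved to the callable it names, using only importlib. It illustrates the notation and is not pex's implementation; resolve_entry_point is a hypothetical helper.

```python
import importlib
import os.path


def resolve_entry_point(spec):
    """Turn a 'package.module:function' string into the object it names."""
    module_name, _, attr_path = spec.partition(":")
    if not attr_path:
        raise ValueError("entry point must look like 'module:function'")
    target = importlib.import_module(module_name)
    # Walk dotted attribute paths such as "SomeClass.main".
    for attr in attr_path.split("."):
        target = getattr(target, attr)
    return target


# Using a stdlib function as a stand-in for an application's main().
join = resolve_entry_point("os.path:join")
```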

Usually, a Python package will define its entry point in its setup.py file,
like so:

And if you have a copy of the package, you can run a command to generate the
various package metadata files:

$ python3 setup.py egg_info

If you look in the resulting stupid.egg_info/entry_points.txt file, you
see the entry point clearly defined there. Ideally, either pex or snap.py
would just figure this out automatically. As it turns out, there's already a
feature request open on pex for this, but in the meantime, how can we
auto-detect the entry point?

For the stupid example, it's pretty easy. Once we've cloned its git
repository, we just run the egg_info command and read the
entry_points.txt file. Later, we can build the project's binary wheel
from the same git clone.
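Reading entry_points.txt is straightforward because it uses an ini-like syntax. Here's a sketch with configparser; the sample contents below use a made-up "demo" script name, and the helper console_scripts is hypothetical rather than something taken from snap.py.

```python
import configparser

# Made-up contents in the shape setuptools writes to entry_points.txt.
SAMPLE = """\
[console_scripts]
demo = demo.__main__:main
"""


def console_scripts(entry_points_text):
    """Return a dict mapping script names to 'module:function' strings."""
    # Only "=" separates names from values; ":" belongs to the value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(entry_points_text)
    if not parser.has_section("console_scripts"):
        return {}
    return dict(parser.items("console_scripts"))


scripts = console_scripts(SAMPLE)
```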

It's a bit more problematic with world though because the package isn't
downloaded from PyPI until pex runs, but the pex command line requires that
you specify the entry point before the download occurs.

We can handle this by supporting an entry_point variable in the snap's
.ini file. For example, here's the world.ini file with an explicit
entry point setting:

[project]
name: world
entry_point: worldlib.__main__:main

[pex]
verbose: true

What if we still wanted to auto-detect the entry point? We could, of course,
download the world package in snap.py and run the egg_info command
over that. But pex also wants to download world and we don't want to have
to download it twice. Maybe we could download it in snap.py and then
build a local wheel file for pex to consume?

As it turns out there's an easier way.

Unfortunately, package egg-info metadata is not available on PyPI, although
arguably it should be. Fortunately, Vinay Sajip runs an external service that
does make the metadata available, such as the metadata for world.

snap.py makes the entry_point variable optional, and if it's missing,
it will grab the package metadata from a link like that given above. An error
will be thrown if the file can't be found, in which case, for now, you'd just
add the [project]entry_point variable to the .ini file.

A little more snap.py detail

The snap.py script is more or less a pure convenience wrapper around
several independent tools: pex, of course, for creating the single executable
zip file, but also the snappy command for building the .snap file. It
also utilizes python3 setup.py egg_info where possible to extract metadata
and construct the snappy facade needed for the snappy build command. Less
typing for you! In the case of a snap built from a git repository, it also
performs the git cloning, and the python3 setup.py bdist_wheel command to
create the wheel file that pex will consume.

There's one other important thing snap.py does: it fixes the resulting pex
file's shebang line. Because we're running these snaps on an Ubuntu Core
system, we know that Python 3 will be available in /usr/bin/python3. We
want the pex file's shebang line to be exactly this. While pex supports a
--python option to specify the interpreter, it doesn't take the value
literally. Instead, it takes the last path component and passes it to
/usr/bin/env so you end up with a shebang line like:

#!/usr/bin/env python3

That might work, but we don't want the pex file to be subject to the
uncertainties of the $PATH environment variable.

One of the things that snap.py does is repack the pex file. Remember,
it's just a zip file with some magic at the top (that magic is the shebang),
so we just read the file that pex spits out, and rewrite it with the shebang
we want. Eventually, pex itself will handle this and we won't need to do
that anymore.
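The repacking step can be sketched in a few lines. This is my own minimal version of the idea, assuming the pex file is a shebang line followed directly by the zip data; the function name rewrite_shebang is hypothetical and the code in snap.py may differ.

```python
import os
import stat


def rewrite_shebang(src_path, dst_path, interpreter="/usr/bin/python3"):
    """Copy a shebang-prefixed zip, replacing its shebang line."""
    with open(src_path, "rb") as fp:
        data = fp.read()
    if data.startswith(b"#!"):
        # Drop everything up to and including the first newline.
        _, _, data = data.partition(b"\n")
    with open(dst_path, "wb") as fp:
        fp.write(b"#!" + interpreter.encode("utf-8") + b"\n")
        fp.write(data)
    # Make the result executable, as a single file executable should be.
    mode = os.stat(dst_path).st_mode
    os.chmod(dst_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Because the zip index is found from the end of the file, changing the length of the first line doesn't invalidate the archive.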

Debugging

While I was working out the code and techniques for this blog post, I ran into
an interesting problem. The world script would crash with some odd
tracebacks. I don't have the details anymore and they'd be superfluous, but
suffice to say that the tracebacks really didn't help in figuring out the
problem. It would work in a local virtual environment build of world
using either the (pip installed) PyPI package or run from the upstream git
repository, but once the snap was installed in my kvm instance, it would
traceback. I didn't know if this was a bug in world, in the snap I built,
or in the Ubuntu Core environment. How could I figure that out?

Of course, the go-to tool for debugging any Python problem is pdb. I'll just
assume you already know this. If not, stop everything and go learn how to use
the debugger.

Okay, but how was I going to get a pdb breakpoint into my snap? This is where
Python 3.5 comes in!

PEP 441, which has already been accepted and implemented in what will be
Python 3.5, aims to improve support for zip applications. Apropos this blog
post, the new zipapp module can be used to zip up a directory into a single
executable file, with an argument to specify the shebang line, and a few other
options. It's related to what pex does, but without all the PyPI interactions
and dependency chasing. Here's how we can use it to debug a pex file.

Let's ignore snappy for the moment and just create a pex of the world
application:

$ pex -r world -o world.pex -e worldlib.__main__:main

Now let's say we want to set a pdb breakpoint in the main() function so
that we can debug the program, even when it's a single executable file. We
start by unzipping the pex:

$ mkdir world
$ cd world
$ unzip ../world.pex

If you poke around, you'll notice a __main__.py file in the current
directory. This is pex's own main entry point. There are also two hidden
directories, .bootstrap and .deps. The former is more pex
scaffolding, but inside the latter you'll see the unpacked wheel directories
for world and its single dependency.

Drilling down a little farther, you'll see that inside the world wheel is
the full source code for world itself. Set a break point by visiting
.deps/world-3.1.1-py2.py3-none-any.whl/worldlib/__main__.py in your
editor. Find the main() function and put this right after the def
line:

import pdb; pdb.set_trace()

Save your changes and exit your editor.

At this point, you'll want to have Python 3.5 installed or available. Let's
assume that by the time you read this, Python 3.5 has been released and is the
default Python 3 on your system. If not, you can always download a
pre-release of the source code, or just build Python 3.5 from its Mercurial
repository. I'll wait while you do this...

...and we're back! Okay, now armed with Python 3.5, and still inside the
world subdirectory you created above, just do this:

$ python3.5 -m zipapp . -p /usr/bin/python3 -o ../world.dbg
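The command above also has a Python equivalent in zipapp.create_archive(), which can be handy if you want to script the repacking. Here's a small sketch, to be run with Python 3.5 or later; the wrapper name build_archive is hypothetical, and the source directory must contain a __main__.py, as the unzipped pex does.

```python
import zipapp


def build_archive(source_dir, target, interpreter="/usr/bin/python3"):
    """Zip up source_dir into target with the given shebang interpreter.

    Returns the interpreter recorded in the new archive's shebang line,
    which is a quick way to confirm what was written.
    """
    zipapp.create_archive(source_dir, target, interpreter=interpreter)
    return zipapp.get_interpreter(target)
```

Keep the target outside the source directory, just as the command line above writes to ../world.dbg.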

Now, before you can run ../world.dbg and watch the break point do its
thing, you need to delete pex's own local cache, otherwise pex will execute
the world dependency out of its cache, which won't have the break point
set. This is a wart that might be worth reporting and fixing in pex itself.
For now:

$ rm -rf ~/.pex
$ ../world.dbg

And now you should be dropped into pdb almost immediately.

If you wanted to build this debugging pex into a snap, just use the snappy
build command directly. You'll need to add the minimal metadata yourself
(since currently snap.py doesn't preserve it). See the Snappy developer
documentation for more details.

Summary and Caveats

There's a lot of interesting technology here: pex for building single file
executables of Python applications, and Snappy Ubuntu Core for atomic,
transactional system updates and lightweight application deployment to the
cloud and things. These allow you to get started doing some basic deployments
of Python applications. No doubt there are lots of loose ends to clean up,
and caveats to be aware of. Here are some known ones:

All of the above only works with Python 3. I think that's a feature, but
you might disagree. ;) This works on Ubuntu Core for free because Python 3
is an essential piece of the base image. Working out how to deploy Python 2
as a Snappy framework would be an interesting exercise.

When we build a snap from a git repository for an application that isn't on
PyPI, I don't currently have a way to also grab some dependencies from PyPI.
The stupid example shown here doesn't have any additional dependencies
so it wasn't a problem. Fixing this should be a fairly simple matter of
engineering on the snap.py wrapper (pull requests welcome!).

We don't really have a great story for cross-compilation of extension
modules. Solving this is probably a fairly complex initiative involving the
distros, setuptools and other packaging tools, and upstream Python. For
now, your best bet might be to actually build the snap on the actual target
hardware.

Importing extension modules requires a file system cache because of
limitations in the dlopen() API. There have been rumors of extensions
to glibc that would provide a dlopen()-from-memory type of API, which
could solve this, or upstream Python's zip support may want to grow native
support for caching.

Even with these caveats, it's pretty easy to turn a Python application into a
Snappy Ubuntu Core application, publish it to the world, and profit! So what
are you waiting for? Snap to it!