Update (2016-03-20): Starting with LLVM 3.7, the instructions shown here for
installing llvmlite from source may not work. See the main page of my
PyKaleidoscope repository for
up-to-date details.

A while ago I wrote a short post about employing llvmpy
to invoke LLVM from within Python. Today I want to demonstrate an alternative
technique, using a new library called llvmlite. llvmlite was created last year by the
developers of Numba (a JIT compiler for scientific Python), and just recently
replaced llvmpy as their bridge to LLVM. Since the Numba devs were also the most
active maintainers of llvmpy in the past couple of years, I think llvmlite is
definitely worth paying attention to.

One of the reasons the Numba devs decided to ditch llvmpy in favor of a new
approach is the biggest issue heavy users of LLVM as a library have - its
incredible rate of API-breaking change. The LLVM C++ API is notoriously unstable
and will remain this way for the foreseeable future. This leaves library users
and all kinds of language bindings (like llvmpy) in a constant chase after the
latest LLVM release, if they want to benefit from improved optimizations, new
targets and so on. The Numba developers felt this while maintaining llvmpy and
decided on an alternative approach that will be easier to keep stable going
forward. In addition, llvmpy's architecture made it slow for Numba's users -
llvmlite fixes this as well.

The main idea is: use the LLVM C API as much as possible. Unlike the core C++
API, the C API is meant for external consumers and is kept relatively stable.
This is what llvmlite does, but with one twist. Since building the IR
using repeated FFI calls to LLVM proved to be slow and error-prone in llvmpy,
llvmlite re-implemented the IR builder in pure Python. Once the IR is built, its
textual representation is passed into the LLVM IR parser. This also reduces the
"API surface" llvmlite uses, since the textual representation of LLVM IR is one
of the more stable things in LLVM.
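The flavor of this approach can be sketched without llvmlite at all: a builder
object assembles instructions in plain Python, and the "output" is nothing but
LLVM IR text. The toy emitter below is my own illustration of the principle,
not llvmlite's actual API (llvmlite's real builder lives in llvmlite.ir and is
far more complete):

```python
class ToyIRBuilder:
    """A toy, pure-Python IR builder that emits LLVM IR as text.

    Illustrative only: no FFI calls happen while building; the
    result is just a string of LLVM IR to hand to a parser.
    """
    def __init__(self, name):
        self.name = name
        self.lines = []
        self.counter = 0

    def fresh_temp(self):
        # Generate a unique temporary name like %tmp1, %tmp2, ...
        self.counter += 1
        return '%%tmp%d' % self.counter

    def add(self, lhs, rhs):
        # Emit an i32 addition and return the temporary holding it.
        tmp = self.fresh_temp()
        self.lines.append('  %s = add i32 %s, %s' % (tmp, lhs, rhs))
        return tmp

    def ret(self, value):
        self.lines.append('  ret i32 %s' % value)

    def __str__(self):
        header = 'define i32 @%s(i32 %%a, i32 %%b) {' % self.name
        return '\n'.join([header, 'entry:'] + self.lines + ['}'])


builder = ToyIRBuilder('sum')
tmp = builder.add('%a', '%b')
builder.ret(tmp)
print(str(builder))
```

Printing the builder yields a valid textual definition of @sum; llvmlite does
essentially this at a much larger scale, then feeds the text to LLVM's parser.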

I found llvmlite pretty easy to build and use on Linux (though it's portable to
OS X and Windows as well). Since there's not much documentation yet, I thought
this post may be useful for others who wish to get started.

After cloning the llvmlite repo, I
downloaded the binary release of LLVM 3.5 -
pre-built binaries for Ubuntu 14.04 mean there's no need to compile LLVM
itself. Note that I didn't install LLVM, just downloaded and untarred it.

Next, I had to install the libedit-dev package with apt-get, since it's
required while building llvmlite. Depending on what you have lying around on
your machine, you may need to install some additional -dev packages.

Now, time to build llvmlite. I chose to use Python 3.4, but any modern version
should work (for versions below 3.4 llvmlite currently requires the enum34
package). LLVM ships a handy tool named llvm-config in its binary release, and
the Makefile in llvmlite uses it, which means building llvmlite against any
LLVM version I want is a simple matter of running:

$ LLVM_CONFIG=<path/to/llvm-config> python3.4 setup.py build

This compiles the C/C++ parts of llvmlite and links them statically to LLVM.
Now, you're ready to use llvmlite. Again, I prefer not to install things
unless I really have to, so the following script can be run with:

$ PYTHONPATH=$PYTHONPATH:<path/to/llvmlite> python3.4 basic_sum.py

Replace the path with your own, or just install llvmlite into some
virtualenv.

The sample code does the same as in the previous post: it creates a function
that adds two numbers, and JITs it:

from ctypes import CFUNCTYPE, c_int
import sys

import llvmlite.ir as ll
import llvmlite.binding as llvm

llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()

# Create a new module with a function implementing this:
#
#   int sum(int a, int b) {
#     return a + b;
#   }
module = ll.Module()

func_ty = ll.FunctionType(ll.IntType(32), [ll.IntType(32), ll.IntType(32)])
func = ll.Function(module, func_ty, name='sum')
func.args[0].name = 'a'
func.args[1].name = 'b'

bb_entry = func.append_basic_block('entry')
irbuilder = ll.IRBuilder(bb_entry)
tmp = irbuilder.add(func.args[0], func.args[1])
ret = irbuilder.ret(tmp)

print('=== LLVM IR')
print(module)

# Convert textual LLVM IR into in-memory representation.
llvm_module = llvm.parse_assembly(str(module))
tm = llvm.Target.from_default_triple().create_target_machine()

# Compile the module to machine code using MCJIT.
with llvm.create_mcjit_compiler(llvm_module, tm) as ee:
    ee.finalize_object()
    print('=== Assembly')
    print(tm.emit_assembly(llvm_module))

    # Obtain a pointer to the compiled 'sum' - it's the address of its JITed
    # code in memory.
    cfptr = ee.get_pointer_to_function(llvm_module.get_function('sum'))

    # To convert an address to an actual callable thing we have to use
    # CFUNCTYPE, and specify the arguments & return type.
    cfunc = CFUNCTYPE(c_int, c_int, c_int)(cfptr)

    # Now 'cfunc' is an actual callable we can invoke.
    res = cfunc(17, 42)
    print('The result is', res)

This should print the LLVM IR for the function we built, its assembly as
produced by LLVM's JIT compiler, and the result 59.
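The final step - turning a raw address into a Python callable - is plain
ctypes and can be tried independently of LLVM. As a sketch, here's the same
CFUNCTYPE mechanism applied to the address of libc's abs, assuming a Unix-like
system where ctypes.CDLL(None) exposes the C library:

```python
import ctypes
from ctypes import CFUNCTYPE, c_int

# Load the C library of the running process (works on Linux and OS X).
libc = ctypes.CDLL(None)

# Take the raw address of libc's 'int abs(int)' ...
addr = ctypes.cast(libc.abs, ctypes.c_void_p).value

# ... and wrap it in a callable with an explicit prototype, exactly as
# done with the JITed 'sum' above.
cfunc = CFUNCTYPE(c_int, c_int)(addr)
print(cfunc(-42))  # prints 42
```

The address the JIT hands back is treated the same way: CFUNCTYPE only cares
that the prototype you declare matches the machine code at that address.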

Compared to llvmpy, llvmlite now seems like the future, mostly due to the
maintenance situation. llvmpy is only known to work with LLVM up to 3.3, which
is already a year and a half old by now. Now that Numba has dropped it, there's
a good chance it will fall further behind. llvmlite, on the other hand, is very
actively developed and keeps pace with the latest stable LLVM release. Also,
it's architected in a way that should make it significantly easier to
keep up with LLVM in the future. Unfortunately, as far as uses outside of Numba
keep up with LLVM in the future. Unfortunately, as far as uses outside of Numba
go, llvmlite is still rough around the edges, especially w.r.t. documentation
and examples. But the llvmlite developers appear keen on making it useful in a
more general setting and not just for Numba, so that's a good sign.