Using Python’s LXML in Amazon Lambda

By Hector Castro on June 27th, 2016

At Azavea, we recently set out to do some XML processing within Amazon Lambda using Python and the lxml library. Once it came time to deploy the function, we realized that the standard method for creating a deployment package was not going to cut it. Why? Because lxml relies on C extensions built against libxml2 and libxslt, and those extensions must be compiled in a way that plays well with the Amazon Lambda execution environment.

Deployment packages

Amazon already has some pretty straightforward documentation on creating deployment packages for Lambda using pip and virtualenv. For pure Python dependencies, the packaging process can look something like this:

$ pip install requests -t .
$ zip -r9 package.zip main.py requests/

For dependencies with C extensions, things get a little more complicated because the C extensions themselves must be compiled against system libraries like those in the Amazon Lambda execution environment. Luckily, we know the execution environment runs Amazon Linux, right down to the Amazon Machine Image (AMI) ID and Linux kernel version.
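One quick way to tell which kind of dependency you are dealing with is to look for compiled files in the installed package directory. The sketch below is a heuristic, not an official check: the function name `find_compiled_extensions` and the list of suffixes are assumptions made for illustration.

```python
import pathlib

# Suffixes that usually indicate compiled code (assumed list, not exhaustive).
COMPILED_SUFFIXES = {".so", ".pyd", ".dylib"}


def find_compiled_extensions(package_dir):
    """Return sorted relative paths of files under package_dir that look compiled."""
    root = pathlib.Path(package_dir)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        # Checking every suffix also catches versioned names like libfoo.so.2.
        if path.is_file() and any(s in COMPILED_SUFFIXES for s in path.suffixes)
    )
```

An empty result suggests the package is pure Python and can be zipped up from any machine. Any hits mean the package should be built in an environment that matches Lambda.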

Launching an Amazon Linux instance with the AWS CLI

In an attempt to smooth out the process of launching an Amazon Linux instance, below is a one-line command to launch the current Amazon Linux AMI for Amazon Lambda as a t2.micro within the default VPC of an AWS account:
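As a rough sketch of what such a launch involves, the Python snippet below assembles an `aws ec2 run-instances` invocation and prints it. The helper name `build_run_instances_command`, the AMI ID `ami-PLACEHOLDER`, and the key pair name `my-key-pair` are all placeholders. Substitute the AMI ID that Amazon documents for the Lambda execution environment and a key pair that exists in your account.

```python
import shlex


def build_run_instances_command(image_id, key_name, instance_type="t2.micro"):
    """Return the argument list for a single `aws ec2 run-instances` call."""
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--count", "1",
    ]


# Placeholder values only; neither the AMI ID nor the key pair name is real.
command = build_run_instances_command("ami-PLACEHOLDER", "my-key-pair")
print(shlex.join(command))
```

No subnet is specified, so EC2 should place the instance in a default subnet of the account's default VPC, assuming one exists. To actually launch the instance, pass the list to `subprocess.run(command, check=True)` with AWS CLI credentials configured.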

Repackaging and deployment

At this point, we have an lxml bundle that is ready for Amazon Lambda, but it lives on an Amazon Linux EC2 instance without any of our function code. Assuming the code we want to deploy is on a local workstation, the following steps walk through downloading the lxml bundle and repackaging it with our Lambda function code.
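Once the bundle has been downloaded and extracted locally, the repackaging step itself can be sketched in Python with the standard `zipfile` module. This is a minimal illustration rather than the exact procedure used here: the function name `build_deployment_package` and the `lxml-bundle/` directory in the usage note are hypothetical, and the output archive is assumed to live outside the bundle directory.

```python
import pathlib
import zipfile


def build_deployment_package(bundle_dir, function_files, output_zip):
    """Zip the contents of bundle_dir plus the function files at the archive root."""
    bundle_root = pathlib.Path(bundle_dir)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store bundle files relative to the bundle root so that packages such
        # as lxml/ sit at the top level of the archive, where Lambda imports them.
        for path in sorted(bundle_root.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(bundle_root).as_posix())
        # Function code also goes at the top level of the archive.
        for function_file in function_files:
            function_path = pathlib.Path(function_file)
            archive.write(function_path, function_path.name)
    return output_zip
```

For example, `build_deployment_package("lxml-bundle", ["main.py"], "package.zip")` would produce a `package.zip` with the bundled libraries and `main.py` side by side at the root of the archive.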