Self-hosting serverless with OpenFaaS

As a weekend project I decided to give OpenFaaS a go. It is essentially AWS Lambda running locally or on your own servers.

May 3, 2020 - 9 min read

openfaas

Introduction

OpenFaaS is self-hosted functions as a service. You can think of it as a self-hosted version of AWS Lambda based on Docker. You build and push Docker images to your OpenFaaS setup and they become available for invocation via HTTP requests.

I was initially interested in OpenFaaS because I have a number of tiny services running on things like Express for Node, Flask for Python and Sinatra for Ruby. Most of these are deployed using Dokku, which is a great way to deploy your apps. However, it is a bit tedious to create a "full project" just to get HTTP access to some simple functions. I hoped that OpenFaaS would be a great solution to this problem.

Deployment

The recommended way to deploy OpenFaaS is to use Kubernetes. That is however not meant for a single tiny little server but rather for a serious production environment. You can also use Docker Swarm instead of Kubernetes but the same problem remains.

There is a faster and simpler way to deploy OpenFaaS: faasd. I used that on a new server to test it out and it was a breeze to get going with cloud-init. But since I already had Docker running apps in production on my other server, I thought: why not just use Docker Swarm instead? It turns out that this was very easy to get going as well. There is a simple deploy script which essentially sets up the whole thing with a single command. Why not Kubernetes? I don't want to run Kubernetes just to be able to run a single thing, and Kubernetes has a lot more moving parts and configuration needed.

While the setup was easy using Docker Swarm and everything worked well, there were a couple of serious problems. Since it is meant for production use it is not optimised for single servers, and it has a lot of CPU overhead for things that you don't have any use for on a single machine. In the end I scrapped the Docker Swarm setup and went back to faasd.

Private Docker registry

To take full advantage of OpenFaaS you need a Docker registry. You have a couple of options for this:

You can pay Docker Hub for private repositories. This is about $7 a month.

You can deploy your own Docker registry.

I went with the latter option. I have no issues paying for things I like, but we were going for dirt cheap here so free is best. It took me about 30 minutes to set up a working registry with SSL and everything using this guide, so this was a great solution. Make sure you follow the OpenFaaS guides on how to use this, since you have to add some special configuration to get authentication working. As an added bonus I now have a private Docker registry that I can use for whatever I want.

Why not AWS Lambda + Serverless framework

The idea here was to try some self-hosted open source fun times. AWS Lambda can do more than OpenFaaS since you have access to the entire AWS ecosystem. Lambda currently supports Java, Go, PowerShell, Node.js, C#, Python, and Ruby. You can run other things but it becomes a bit of a pain to do. OpenFaaS will run just about anything as long as it can run inside a container, so Rust is an option for example.

On AWS, if you need some storage you can easily provision a DynamoDB table and store your data there. For OpenFaaS you'd need to host some kind of key-value store like Redis or a proper database for this. I haven't tried linking these to the containers running OpenFaaS functions, but there are several ways to do it. The purpose of my setup is not to host some kind of crazy big serverless stack, but rather to have a quick way to experiment and run small things without the overhead of a full server per service. If I need Redis or something I am comfortable setting that up myself.

Read the template source code

A template is the base of any OpenFaaS function. There are templates for just about any programming language out there, as well as a fully custom one which just supplies a Dockerfile.

I cannot stress enough how important it is to read the source code for these templates. You need to do it to figure out what is going on and how the handlers work. For this to be effective you need to have some grasp of Docker fundamentals. You also need to read the source code for the handler logic in order to understand what is sent to your functions and what is expected in return. Finally, this is also important if you ever want to write your own templates.

Polyglot programming

One of the major benefits for me when using OpenFaaS is being able to toss together any tiny little thing in any programming language and OpenFaaS will just run it with no issues. I can quickly switch a function from Ruby to Python, for example, and OpenFaaS will handle it just fine. I use Puppeteer with Node, Newspaper with Python and so on. This is perhaps not very effective when it comes to pure efficiency and getting things done, but it is a lot of fun to jump around between programming languages without the barrier of learning to host the resulting applications and whatever boilerplate a framework brings. I wouldn't recommend doing this if you're short on time however.

Hosting on Hetzner

Faasd, and OpenFaaS, will run on pretty much any Linux server you can think of. There are tutorials for DigitalOcean and such, but any service will work. Since I have experience with Hetzner I chose them as my hosting service. For my server I chose the second cheapest one they have: a brand new server offering with 2 CPU cores, 2 GB of RAM and 40 GB of disk. This turned out to be plenty for what I was trying to do, so you can probably deploy faasd to something even smaller if you wish. The server sits at pretty much zero load when the functions are not running, so there is very little overhead in the faasd service itself. The memory usage is also low: about 600 MB while no functions are running, including the Linux system itself. Again, faasd is made for deployment on a single server. If you want multiple servers with failover and such, then you need to use the default method of Kubernetes or Docker Swarm.

Developing and testing locally

Since everything runs on Docker, you need Docker on your own machine too, because the images are built locally and then pushed to the registry. While you can certainly deploy your functions and test them on the OpenFaaS server itself, that is rather painful if the functions don't work. Initially I had a lot of trouble figuring out how to run the functions locally, but once I realised that this is just Docker it became apparent. You fire up a one-off Docker container running your function with something like faas build -f function.yml && docker run --rm -p 3000:8080 docker-registry.myserver.com/function:latest. While this is ok for development, keep in mind that it takes a while since the Docker image has to be rebuilt every time. The layers are cached but there is still a delay: about 10 seconds on my machine. I suppose you could set up some kind of mount into the container so that you can develop locally without rebuilding every time, but that is for another day. Rebuilding the container works for now, since the functions are so small and don't have a lot of moving parts inside the container.

Some templates, like the Node ones, come with a built-in test hook that runs while the image is building. It will abort the build if the tests fail, which is pretty nice! You should be able to add such hooks to other templates with ease if you want.

Continuous integration

I haven't looked into this further, but since this is just Docker in the end it should be quite easy to run on any CI service which supports Docker. I'm a fan of CircleCI and they have Docker support, so you should be able to use them to build the image and run any tests before you push it to the Docker registry. You might have to install the faas-cli tooling in the build, but that should be pretty easy to do.

Use cases

My plan with all of this was to use OpenFaaS for deploying things that are too small to warrant a whole service. For example, I made a function that wraps the Newspaper3k package in Python to extract data from articles online. This is all you need to do something like this in OpenFaaS (you'll need a requirements.txt file too, to include Newspaper):
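A minimal sketch of what such a handler could look like, assuming the python3 template (which passes the raw request body to a handle function and returns whatever it gives back). The response field names here are my own choice, not something the template dictates:

```python
# handler.py -- a sketch of a Newspaper3k wrapper for the OpenFaaS
# python3 template. The template calls handle() with the raw request
# body and sends back whatever it returns.
import json

def handle(req):
    url = (req or "").strip()
    if not url.startswith("http"):
        return json.dumps({"error": "expected an article URL in the request body"})

    # Imported lazily here for illustration; in the real function it is
    # installed via the requirements.txt file mentioned above.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return json.dumps({
        "title": article.title,
        "authors": article.authors,
        "text": article.text,
    })
```

The exact shape of the request and response is template-specific, which is exactly why reading the template source matters.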

The handle function is the main hook for the functions you write, and most templates have something very similar to this. You need to read the template code to figure out what the event is so that you know what to do with it. You also need to read the code to figure out what the template expects the handle function to return.

Now I have a hosted function to extract article data from any article online. Neat!

Another thing that I wanted to do is have a function that renders a web page in Puppeteer. Not terribly exciting, but since a lot of pages out there don't work unless JavaScript is enabled, it can be hard to do web scraping for example. I built a function that will take any URL, render the page with JavaScript enabled and return the resulting HTML. This was a bit more involved since I needed a Chrome runtime in my Docker image, so I had to build a custom template for it. That is too much for this article, but it is available on GitHub.

There are of course tons of other use cases for something like this, with endless combinations of functions. The best part for me is that now I don't have to create a whole project just to add HTTP access to some library, I can just use OpenFaaS for it.

Monorepo

I decided to keep all my functions in a single repo. They are small enough not to warrant their own repositories, and it's nice to have everything in a single place. I only have about 10 functions so far, so it's hard to tell how this will hold up in the future.

Prometheus

OpenFaaS comes with Prometheus. It will monitor the functions and report back memory usage, CPU time, invocation count and so on. You can graph this in Prometheus directly or use something else like Grafana. Since I already had a Grafana setup I used that for monitoring. It worked like a charm and is a very nice thing to have out of the box.

Async functions

I only briefly touched on this, but OpenFaaS has a powerful mechanism that allows you to run functions asynchronously. Instead of returning the result when the function is done running, the gateway returns an id right away. The request is put into a queue and processed when a worker is available. When processing is complete, the result is sent to a callback URL that you can supply when the function is invoked. This can be scaled up with multiple workers to get through the queue faster.
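In practice this just means POSTing to a different path on the gateway. A sketch with Python's standard library, where the hostnames and function name are placeholders of mine:

```python
import urllib.request

# Async invocation: POST to /async-function/<name> instead of
# /function/<name>. The optional X-Callback-Url header tells the queue
# worker where to POST the result once the function has run.
req = urllib.request.Request(
    "http://gateway.example.com:8080/async-function/extract-article",
    data=b"https://example.com/some-article",
    headers={"X-Callback-Url": "http://my-app.example.com/hooks/done"},
    method="POST",
)
# urllib.request.urlopen(req) would return almost immediately with an
# accepted status and a call id; the result arrives later at the
# callback URL.
```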

This will of course make a lot more sense in a distributed environment running on Kubernetes or Docker Swarm. Still, it is a great fit for functions that don't need to be processed immediately, like sending email or uploading files.

Secrets

Secret data like API keys and such is handled through secret files if you're using faasd. If you're using Kubernetes or Docker Swarm you can use their native secrets management instead. Inside the containers the secrets are available as files on disk which you read at runtime. This might seem a bit clunky, but it's pretty easy to work with.
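Reading a secret at runtime is just reading a file. A small helper like this is all it takes; the mount path is the OpenFaaS default, and the base parameter is my own addition so the helper is easy to try locally:

```python
from pathlib import Path

# OpenFaaS mounts each secret as a plain file inside the container,
# by default under /var/openfaas/secrets (Docker Swarm exposes its
# secrets under /run/secrets instead).
OPENFAAS_SECRETS_DIR = "/var/openfaas/secrets"

def read_secret(name, base=OPENFAAS_SECRETS_DIR):
    # Strip the trailing newline that secret files often carry.
    return Path(base, name).read_text().strip()
```

Calling read_secret("my-api-key") inside a function then returns the value of a secret previously created with faas-cli.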

CLI tooling

OpenFaaS has its own CLI tool called faas-cli. It is the main tool for creating new functions, deploying functions, and so on. For example, you run faas-cli up -f myfunction.yml to build, upload and deploy a function.

What now?

Overall I'm very happy with my little VPS running OpenFaaS. I will keep adding new functions and will push some production requests to it soon to replace my old services. After that I think that I'll take a closer look at async functions and then perhaps experiment with a proper production setup on multiple servers. Stay tuned for more articles about OpenFaaS!