
How to Rate Limit HTTP Requests

If you're running an HTTP server and want to rate limit user requests, the go-to package to use is probably Tollbooth by Didip Kerabat. It's well maintained, has a good range of features, and a clean and clear API.

But if you want something simple and lightweight – or just want to learn – it's not too difficult to roll your own middleware to handle rate limiting. In this post I'll run through the essentials of how to do that.

If you'd like to follow along, you'll need to install the x/time/rate package. This provides a token-bucket rate limiter (which Tollbooth also uses behind the scenes).

$ go get golang.org/x/time/rate

Then create a demo directory containing two files: limit.go and main.go.

$ mkdir ratelimit-demo
$ cd ratelimit-demo
$ touch limit.go main.go

Let's start by making a global rate limiter which acts on all the requests that an HTTP server receives.

A Limiter controls how frequently events are allowed to happen. It implements a "token bucket" of size b, initially full and refilled at rate r tokens per second.

Or to describe it another way – the limiter permits you to consume an average of r tokens per second, with a maximum of b tokens in any single 'burst'. So in the code above our limiter allows 2 tokens to be consumed per second, with a maximum burst size of 5.

In the limit middleware function we call the global limiter's Allow() method each time the middleware receives an HTTP request. If there are no tokens left in the bucket, Allow() will return false and we send the user a 429 Too Many Requests response. Otherwise, calling Allow() consumes exactly one token from the bucket and we pass control on to the next handler in the chain.

It's important to note that the code behind the Allow() method is protected by a mutex and is safe for concurrent use.

Let's put this to use. Open up the main.go file and set up a simple web server which uses the limit middleware like so:

Rate limiting per user

While having a single, global rate limiter is useful in some cases, another common scenario is implementing a rate limiter per user, based on an identifier like an IP address or API key. In this post we'll use IP address as the identifier.

A conceptually straightforward way to do this is to create a map of rate limiters, using the identifier for each user as the map key.

At this point you might think to reach for the sync.Map type introduced in Go 1.9. This essentially provides a concurrency-safe map, designed to be accessed from multiple goroutines without the risk of race conditions. But it comes with a note of caution:

It is optimized for use in concurrent loops with keys that are stable over time, and either few steady-state stores, or stores localized to one goroutine per key.

For use cases that do not share these attributes, it will likely have comparable or worse performance and worse type safety than an ordinary map paired with a read-write mutex.

In our particular use case the map keys will be the IP addresses of users, so new keys will be added to the map each time a new user visits our application. We'll also want to prevent undue memory consumption by removing old entries from the map when a user hasn't been seen for a long period of time.

So in our case the map keys won't be stable and it's likely that an ordinary map protected by a mutex will perform better. (If you're not familiar with the idea of mutexes or how to use them in Go, then this post has an explanation which you might want to read before continuing).