
Toxiproxy

Toxiproxy is a framework for simulating network conditions. It's made
specifically to work in testing, CI and development environments, supporting
deterministic tampering with connections, but with support for randomized chaos
and customization. Toxiproxy is the tool you need to prove with tests that
your application doesn't have single points of failure. We've been
successfully using it in all development and test environments at Shopify since
October, 2014. See our blog post on resiliency for more information.

Toxiproxy consists of two parts: a TCP proxy written in Go (what this
repository contains) and a client that communicates with the proxy over HTTP. You
configure your application to make all test connections go through Toxiproxy
and can then manipulate their health via HTTP. See Usage
below on how to set up your project.

For example, to add 1000ms of latency to the response of MySQL from the Ruby
client:

Why yet another chaotic TCP proxy?

The existing ones we found didn't provide the kind of dynamic API we needed for
integration and unit testing. Linux tools like nc and so on are not
cross-platform and require root, which makes them problematic in test,
development and CI environments.

Clients

Example

Let's walk through an example with a Rails application. Note that Toxiproxy is
in no way tied to Ruby; it's just been our first use case. You can see the full example at
sirupsen/toxiproxy-rails-example.
To get started right away, jump down to Usage.

For our popular blog, for some reason we're storing the tags for our posts in
Redis and the posts themselves in MySQL. We might have a Post class that
includes some methods to manipulate tags in a Redis set:

class Post < ActiveRecord::Base
  # Return an Array of all the tags.
  def tags
    TagRedis.smembers(tag_key)
  end

  # Add a tag to the post.
  def add_tag(tag)
    TagRedis.sadd(tag_key, tag)
  end

  # Remove a tag from the post.
  def remove_tag(tag)
    TagRedis.srem(tag_key, tag)
  end

  # Return the key in Redis for the set of tags for the post.
  def tag_key
    "post:tags:#{self.id}"
  end
end

We've decided that erroring while writing to the tag data store
(adding/removing) is OK. However, if the tag data store is down, we should be
able to see the post with no tags. We could simply rescue the
Redis::CannotConnectError around the SMEMBERS Redis call in the tags
method. Let's use Toxiproxy to test that.
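
A minimal sketch of that rescue, written so it runs without Rails or the redis gem. The `Redis::CannotConnectError` class defined here is only a stand-in for the one the redis gem provides, `DownTagStore` is a hypothetical fake that behaves like a tag store that is down, and `Post` is a plain Ruby object rather than an ActiveRecord model.

```ruby
# Stand-in for the error class that the redis gem defines.
class Redis
  class CannotConnectError < StandardError; end
end

# Hypothetical fake tag store that always behaves as if Redis is down.
module DownTagStore
  def self.smembers(_key)
    raise Redis::CannotConnectError, "connection refused"
  end
end

# Plain Ruby version of the Post class, with the tag store injected.
class Post
  attr_reader :id

  def initialize(id, tag_store)
    @id = id
    @tag_store = tag_store
  end

  # Return an Array of all the tags, or an empty Array when the
  # tag store cannot be reached.
  def tags
    @tag_store.smembers(tag_key)
  rescue Redis::CannotConnectError
    []
  end

  # Return the key in Redis for the set of tags for the post.
  def tag_key
    "post:tags:#{id}"
  end
end

post = Post.new(1, DownTagStore)
p post.tags
```

In the real application the store would be `TagRedis`, and Toxiproxy would be what makes the connection fail during the test.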

Since we've already installed Toxiproxy and it's running on our machine, we can
skip to step 2. This is where we need to make sure Toxiproxy has a mapping for
Redis tags. To config/boot.rb (before any connection is made) we add:

The tests pass! We now have a unit test that proves fetching the tags when Redis
is down returns an empty array, instead of throwing an exception. For full
coverage you should also write an integration test that wraps fetching the
entire blog post page when Redis is down.

If using Toxiproxy from the host rather than other containers, enable host networking with --net=host.

Source

If you have Go installed, you can build Toxiproxy from source using the Makefile:

$ make build
$ ./toxiproxy-server

Upgrading from Toxiproxy 1.x

In Toxiproxy 2.0 several changes were made to the API that make it incompatible with version 1.x.
In order to use version 2.x of the Toxiproxy server, you will need to make sure your client
library supports the same version. You can check which version of Toxiproxy you are running by
looking at the /version endpoint.

See the documentation for your client library for specific library changes. Detailed changes
for the Toxiproxy server can be found in CHANGELOG.md.

2. Populating Toxiproxy

When your application boots, it needs to make sure that Toxiproxy knows which
endpoints to proxy where. The main parameters are: name, address for Toxiproxy
to listen on and the address of the upstream.

Some client libraries have helpers for this task, which is essentially just
making sure each proxy in a list is created. Example from the Ruby client:

We recommend a naming scheme such as <app>_<env>_<data store>_<shard>.
This ensures there are no clashes between applications using the same
Toxiproxy.
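
As a rough sketch of the naming scheme, the helper below builds proxy names from their parts and checks that a list of proxy definitions contains no duplicates. `proxy_name` and `PROXIES` are hypothetical names, not part of any client library, and the hash keys, listen ports and upstream addresses are illustrative assumptions.

```ruby
# Hypothetical helper: join the parts of the recommended naming scheme,
# skipping the shard when there is none.
def proxy_name(app, env, data_store, shard = nil)
  [app, env, data_store, shard].compact.join("_")
end

# Illustrative proxy definitions: a name, the address Toxiproxy listens on,
# and the address of the upstream.
PROXIES = [
  { name: proxy_name("shopify", "test", "redis", "master"),
    listen: "127.0.0.1:22220", upstream: "127.0.0.1:6380" },
  { name: proxy_name("shopify", "test", "mysql", "master"),
    listen: "127.0.0.1:22221", upstream: "127.0.0.1:3306" }
].freeze

# Names must be unique, otherwise two definitions would clash.
names = PROXIES.map { |proxy| proxy[:name] }
raise "duplicate proxy names" unless names.uniq.length == names.length

puts names
```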

For large applications we recommend storing the Toxiproxy configurations in a
separate configuration file. We use config/toxiproxy.json. This file can be
passed to the server using the -config option, or loaded by the application
to use with the populate function.

Use ports outside the ephemeral port range to avoid random port conflicts.
On Linux the default range is roughly 32,768 to 61,000; for the exact values
on your machine, see /proc/sys/net/ipv4/ip_local_port_range.
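
A small sketch of how a listen port could be checked against that range. Both helper names are hypothetical, and the sample string only imitates the two-number format of `ip_local_port_range`; read the file on your own machine for real values.

```ruby
# Parse the contents of ip_local_port_range, which holds two integers
# (low and high) separated by whitespace.
def parse_port_range(text)
  low, high = text.split.map { |part| Integer(part) }
  low..high
end

# True when the port falls inside the ephemeral range and should be avoided.
def ephemeral_port?(port, range)
  range.cover?(port)
end

# Sample contents; on Linux you could instead use
# File.read("/proc/sys/net/ipv4/ip_local_port_range").
range = parse_port_range("32768\t60999\n")

puts ephemeral_port?(22220, range) # false: safe to use as a listen port
puts ephemeral_port?(40000, range) # true: may clash with an outgoing connection
```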

3. Using Toxiproxy

To use Toxiproxy, you now need to configure your application to connect through
Toxiproxy. Continuing with our example from step two, we can configure our Redis
client to connect through Toxiproxy:

# old straight to redis
redis = Redis.new(port: 6380)
# new through toxiproxy
redis = Redis.new(port: 22220)

Now you can tamper with it through the Toxiproxy API. In Ruby:

redis = Redis.new(port: 22220)
Toxiproxy[:shopify_test_redis_master].downstream(:latency, latency: 1000).apply do
  redis.get("test") # will take 1s
end

The stream direction must be either upstream or downstream. upstream applies
the toxic on the client -> server connection, while downstream applies the toxic
on the server -> client connection. This can be used to modify requests and responses
separately.
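
To make the stream direction concrete, here is a sketch that builds (but does not send) the HTTP request for adding a toxic, using only the Ruby standard library. The JSON field names `type`, `stream` and `attributes` are assumptions about the Toxiproxy 2.x HTTP API, and `build_toxic_request` is a hypothetical helper; check the API documentation of your server version before relying on them.

```ruby
require "json"
require "net/http"

STREAMS = %w[upstream downstream].freeze

# Build a POST /proxies/{proxy}/toxics request without sending it.
def build_toxic_request(proxy, type:, stream:, attributes:)
  unless STREAMS.include?(stream)
    raise ArgumentError, "stream must be upstream or downstream"
  end

  request = Net::HTTP::Post.new("/proxies/#{proxy}/toxics",
                                "Content-Type" => "application/json")
  payload = { type: type, stream: stream, attributes: attributes }
  request.body = JSON.generate(payload)
  request
end

# Delay responses (server -> client) by 1000ms.
request = build_toxic_request("shopify_test_redis_master",
                              type: "latency",
                              stream: "downstream",
                              attributes: { latency: 1000 })
puts request.path
puts request.body
```

Passing `stream: "upstream"` instead would apply the same toxic to requests on their way to the server.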

Endpoints

All endpoints are JSON.

GET /proxies - List existing proxies and their toxics

POST /proxies - Create a new proxy

POST /populate - Create or replace a list of proxies

GET /proxies/{proxy} - Show the proxy with all its active toxics

POST /proxies/{proxy} - Update a proxy's fields

DELETE /proxies/{proxy} - Delete an existing proxy

GET /proxies/{proxy}/toxics - List active toxics

POST /proxies/{proxy}/toxics - Create a new toxic

GET /proxies/{proxy}/toxics/{toxic} - Get an active toxic's fields

POST /proxies/{proxy}/toxics/{toxic} - Update an active toxic

DELETE /proxies/{proxy}/toxics/{toxic} - Remove an active toxic

POST /reset - Enable all proxies and remove all active toxics

GET /version - Returns the server version number

Populating Proxies

Proxies can be added and configured in bulk using the /populate endpoint. This is done by
passing a JSON array of proxies to Toxiproxy. If a proxy with the same name already exists,
it will be compared to the new proxy and replaced if the upstream and listen address don't match.

A /populate call can be included for example at application start to ensure all required proxies
exist. It is safe to make this call several times, since proxies will be untouched as long as their
fields are consistent with the new data.
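
The behaviour described above can be modelled in a few lines. This is an in-memory sketch of the create-or-replace rule, not the server's implementation; the `populate` method here and the hash keys are hypothetical.

```ruby
# `existing` maps proxy name => { listen:, upstream: }.
# Returns the names of the proxies that were created or replaced.
def populate(existing, desired)
  changed = []
  desired.each do |proxy|
    fields = { listen: proxy[:listen], upstream: proxy[:upstream] }
    # Fields are consistent with the new data: leave the proxy untouched.
    next if existing[proxy[:name]] == fields

    # New proxy, or listen/upstream differ: create or replace it.
    existing[proxy[:name]] = fields
    changed << proxy[:name]
  end
  changed
end

proxies = {}
desired = [{ name: "shopify_test_redis_master",
             listen: "127.0.0.1:22220", upstream: "127.0.0.1:6380" }]

first  = populate(proxies, desired) # creates the proxy
second = populate(proxies, desired) # same data, nothing changes
moved  = populate(proxies, [desired.first.merge(upstream: "127.0.0.1:6381")])

p first, second, moved
```

Calling it twice with the same data changes nothing the second time, which is why a populate call at every application start is safe.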

$ redis-cli -p 26379
Could not connect to Redis at 127.0.0.1:26379: Connection refused

Frequently Asked Questions

How fast is Toxiproxy? The speed of Toxiproxy depends largely on your hardware,
but you can expect a latency of < 100µs when no toxics are enabled. When running
with GOMAXPROCS=4 on a MacBook Pro we achieved ~1000MB/s throughput, and as high
as 2400MB/s on a higher end desktop. Basically, you can expect Toxiproxy to move
data around at least as fast as the app you're testing.

Can Toxiproxy do randomized testing? Many of the available toxics can be configured
to have randomness, such as jitter in the latency toxic. There is also a
global toxicity parameter that specifies the percentage of connections a toxic
will affect. This is most useful for things like the timeout toxic, which would
allow X% of connections to time out.

I am not seeing my Toxiproxy actions reflected for MySQL. MySQL will prefer
the local Unix domain socket for some clients, no matter which port you pass it
if the host is set to localhost. Configure your MySQL server to not create a
socket, and use 127.0.0.1 as the host. Remember to remove the old socket
after you restart the server.

Should I run a Toxiproxy for each application? No, we recommend using the
same Toxiproxy for all applications. To distinguish between services we
recommend naming your proxies with the scheme: <app>_<env>_<data store>_<shard>.
For example, shopify_test_redis_master or shopify_development_mysql_1.

Development

make. Build a toxiproxy development binary for the current platform.

make all. Build Toxiproxy binaries and packages for all platforms. Requires
Go compiled with cross-compilation enabled on Linux and Darwin (amd64),
as well as fpm in your $PATH to
build the Debian package.

make test. Run the Toxiproxy tests.

make darwin. Build binary for Darwin.

make linux. Build binary for Linux.

make windows. Build binary for Windows.

Release

Ensure this release has run internally for Shopify/shopify for at least a
day, which is the best fuzzy test for robustness we have.