It’s a Netflix project for improving the fault tolerance of their infrastructure.
The Chaos Monkey randomly shuts down servers or blocks network connections, and the
system is supposed to survive these events. It’s a way to verify the high availability
and fault tolerance of the system.

Besides a redundant infrastructure, if you think about reliability at the level
of your web applications there are many questions that often remain unanswered:

What happens if the MySQL server is restarted? Are your connectors able
to survive this event and continue to work properly afterwards?

Is your web application still working in degraded mode when Membase is
down?

Are you sending back the right 503s when PostgreSQL times out?

Of course you can, and should, try out all these scenarios in a staging
environment while your application is under a realistic load.

But testing these scenarios while you are building your code is also a good
practice, and having automated functional tests for this is preferable.

That’s where Vaurien is useful.

Vaurien is basically a Chaos Monkey for your TCP connections. Vaurien
acts as a proxy between your application and any backend.

You can use it in your functional tests or even on a real deployment
through the command-line.

Vaurien is a TCP proxy that simply reads the data sent to it and passes it to a
backend, and vice versa.

It has built-in protocols: TCP, HTTP, Redis & Memcache. The TCP protocol
is the default one: it just reads data on both sides and passes it along.
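As an illustration of what "reads data on both sides and passes it along" means, here is a minimal sketch of a byte relay between two connected sockets. It is not Vaurien's implementation: the names `relay` and `proxy_connection` are hypothetical, and a real proxy would also handle timeouts, errors and many concurrent connections.

```python
import socket
import threading


def relay(source, dest, bufsize=4096):
    """Copy bytes from source to dest until source signals end-of-stream."""
    while True:
        data = source.recv(bufsize)
        if not data:
            break
        dest.sendall(data)
    # Propagate the end-of-stream so the other side knows no more data is coming.
    dest.shutdown(socket.SHUT_WR)


def proxy_connection(client, backend):
    """Shuttle bytes both ways between two already-connected sockets."""
    # backend -> client runs in a helper thread, client -> backend in this one.
    back = threading.Thread(target=relay, args=(backend, client))
    back.start()
    relay(client, backend)
    back.join()
    client.close()
    backend.close()
```

A delay or an injected error would sit inside `relay`, between the `recv` and the `sendall`, which is why a proxy of this kind is a convenient place to simulate backend failures.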

Higher-level protocols are mandatory in some cases: when Vaurien needs to
read a specific amount of data from the sockets, when it needs to be aware
of the kind of response it is waiting for, and so on.

Vaurien also has behaviors. A behavior is a class that’s going to be
invoked every time Vaurien proxies a request. That’s how you can change what
the proxy does. For instance, adding a delay or degrading the response
can be implemented in a behavior.

Both protocols and behaviors are plugins, allowing you to extend Vaurien
by adding new ones.

Last (but not least), Vaurien provides a couple of APIs you can use to
change the behavior of the proxy live. That’s handy when you are doing
functional tests against your server: you can for instance start to add
big delays and see how your web application reacts.

If you want to run and drive a Vaurien proxy from your code, the project
provides a few helpers for this.

For example, if you want to write a test that uses a Vaurien proxy,
you can write:

    import unittest
    from vaurien import Client, start_proxy, stop_proxy


    class MyTest(unittest.TestCase):

        def setUp(self):
            self.proxy_pid = start_proxy(port=8080)

        def tearDown(self):
            stop_proxy(self.proxy_pid)

        def test_one(self):
            client = Client()
            options = {'inject': True}

            with client.with_behavior('error', **options):
                # do something...
                pass

            # we're back to normal here

In this test, the proxy is started and stopped before and after the
test, and the Client class will let you drive its behavior.

Within the with block, the proxy will error out any call by using
the error behavior, so you can verify that your application is
behaving as expected when that happens.
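To make the "back to normal" part concrete, here is a minimal sketch of how a with block can switch a behavior on and then restore the previous one. It uses a stand-in class, not Vaurien's Client: `FakeClient` and the `'normal'` behavior name are hypothetical and only illustrate the pattern.

```python
from contextlib import contextmanager


class FakeClient:
    """Stand-in for a proxy client; it only tracks the active behavior name."""

    def __init__(self):
        self.behavior = 'normal'   # hypothetical name for "no fault injected"
        self.options = {}

    @contextmanager
    def with_behavior(self, name, **options):
        previous = (self.behavior, self.options)
        self.behavior, self.options = name, options
        try:
            yield self
        finally:
            # Restore the earlier state even if the test body raised.
            self.behavior, self.options = previous


client = FakeClient()
with client.with_behavior('error', inject=True):
    active_inside = client.behavior   # the fault is active only in here
active_after = client.behavior        # restored once the block exits
```

Restoring the state in a `finally` clause means a failing assertion inside the block cannot leave the fault switched on for the tests that follow.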

Vaurien comes with a handful of useful behaviors and protocols,
but you can create your own and plug them in through a configuration file.

In fact, that’s the best way to create realistic issues: imagine you
have a very specific type of error on your LDAP server every time your
infrastructure is under heavy load. You can reproduce this issue in your
behavior and make sure your web application behaves as it should.

Creating new behaviors and protocols is done by implementing classes with
specific signatures.

For example, if you want to create a “super” behavior, you just have
to write a class with two special methods: on_before_handle and
on_after_handle.
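A minimal sketch of what such a class could look like is shown here. Only the two method names come from the text above; the argument list, the `name` attribute, the `True` return value and the `delay` option are assumptions made for illustration, so check the Vaurien documentation for the exact signatures before relying on them.

```python
import time


class SuperBehavior:
    """Hypothetical 'super' behavior: delays each request, then counts it."""

    name = 'super'  # assumed: how the behavior would be referenced in config

    def __init__(self, delay=0.1):
        self.delay = delay      # seconds to wait before data is proxied
        self.handled = 0        # number of requests seen so far

    # The argument list of both hooks is an assumption, not a documented API.
    def on_before_handle(self, protocol, source, dest, to_backend):
        time.sleep(self.delay)
        return True  # assumed to mean "carry on and proxy the data"

    def on_after_handle(self, protocol, source, dest, to_backend):
        self.handled += 1
        return True
```

The split into a "before" and an "after" hook lets one behavior both tamper with a request (delay it, drop it, corrupt it) and observe what happened once the data has been passed along.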