Event loops demystified

Sep 4, 2012 • Magnus Holm

This issue of Practicing Ruby was contributed by Magnus Holm (@judofyr),
a Ruby programmer from Norway. Magnus works on various open source
projects (including the Camping web framework),
and writes articles over at the timeless repository.

Boom, a server is up and running! Working in Ruby has some disadvantages, though: we
can handle only one connection at a time. We can also have only one server
running at a time. It is no exaggeration to say that these constraints
can be quite limiting.

There are several ways to improve this situation, but lately we’ve seen an
influx of event-driven solutions. Node.js is just an event-driven I/O library
built on top of JavaScript. EventMachine has been a solid solution in the Ruby
world for several years. Python has Twisted, and Perl has so many that they even
have an abstraction around them.

Although these solutions might seem like silver bullets, there are subtle details that
you’ll have to think about. You can accomplish a lot by following simple rules
(“don’t block the thread”), but I always prefer to know precisely what I’m
dealing with. Besides, if doing regular I/O is so simple, why does
event-driven I/O have to be looked at as black magic?

To show that event loops are nothing to be afraid of, we are going to implement an
I/O event loop in this article. Yep, that’s right; we’ll capture the core
part of EventMachine/Node.js/Twisted in about 150 lines of Ruby. It won’t
be performant, it won’t be test-driven, and it won’t be solid, but it will
use the same concepts as in all of these great projects. We will start
by looking at a minimal chat server example and then discuss
how to build the infrastructure that supports it.

Obligatory chat server example

Because chat servers seem to be the event-driven equivalent of a
“hello world” program, we will keep with that tradition here. The
following example shows a trivial ChatServer object that uses
the IOLoop that we’ll discuss in this article:
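
A minimal sketch of such a server might look like the code below. It assumes the interfaces described later in this article: a listener that emits :accept with a stream, and streams that emit :data and :close and accept outgoing data through <<. The ChatServer body, its attach method, and the in-memory FakeStream used to exercise it are illustrative assumptions, not part of any library.

```ruby
# EventEmitter as defined later in this article.
module EventEmitter
  def _callbacks
    @_callbacks ||= Hash.new { |h, k| h[k] = [] }
  end

  def on(type, &blk)
    _callbacks[type] << blk
    self
  end

  def emit(type, *args)
    _callbacks[type].each { |blk| blk.call(*args) }
  end
end

# Hypothetical chat server: relays each chunk to every other client.
class ChatServer
  attr_reader :clients

  def initialize
    @clients = []
  end

  # `listener` is anything that emits :accept with a stream-like object,
  # for example the value returned by IOLoop#listen.
  def attach(listener)
    listener.on(:accept) { |stream| add_client(stream) }
  end

  def add_client(stream)
    @clients << stream
    stream.on(:data)  { |chunk| broadcast(chunk, stream) }
    stream.on(:close) { @clients.delete(stream) }
  end

  def broadcast(chunk, sender)
    @clients.each { |client| client << chunk unless client.equal?(sender) }
  end
end

# In-memory stand-in for Stream, used only to exercise the sketch.
class FakeStream
  include EventEmitter
  attr_reader :written

  def initialize
    @written = +""
  end

  def <<(chunk)
    @written << chunk
    self
  end
end

listener = Object.new.extend(EventEmitter) # stands in for a listening server
chat = ChatServer.new
chat.attach(listener)

alice = FakeStream.new
bob = FakeStream.new
listener.emit(:accept, alice)
listener.emit(:accept, bob)

alice.emit(:data, "hi\n") # relayed to bob only
bob.emit(:close)          # bob is removed from the client list
```

With a real loop, the listener would come from IOLoop#listen and the final step would be a call to IOLoop#start.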

If you don’t have the time to try out this code right now,
don’t worry: as long as you understand the basic idea behind it, you’ll be fine.
This chat server is here to serve as a practical example to help you
understand the code we’ll be discussing throughout this article.

Now that we have a place to start from, let’s build our event system.

Event handling

First of all, we obviously need events! Without further ado:

    module EventEmitter
      def _callbacks
        @_callbacks ||= Hash.new { |h, k| h[k] = [] }
      end

      def on(type, &blk)
        _callbacks[type] << blk
        self
      end

      def emit(type, *args)
        _callbacks[type].each do |blk|
          blk.call(*args)
        end
      end
    end

    class HTTPServer
      include EventEmitter
    end

    server = HTTPServer.new
    server.on(:request) do |req, res|
      res.respond(200, 'Content-Type' => 'text/html')
      res << "Hello world!"
      res.close
    end

    # When a new request comes in, the server will run:
    #   server.emit(:request, req, res)

EventEmitter is a module that we can include in classes that can send and
receive events. In one sense, this is the most important part of our event
loop: it defines how we use and reason about events in the system. Modifying it
later will require changes all over the place. Although this particular
implementation is a bit simpler than what you’d expect from a real
library, it covers the fundamental ideas that are common to all
event-based systems.
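
To see the module's behavior in isolation, here is a small runnable sketch. The Door class and the :open event are made-up names for illustration. It shows that listeners run in registration order, that #on returns self so calls can be chained, and that emitting an event with no listeners is harmless.

```ruby
module EventEmitter
  def _callbacks
    @_callbacks ||= Hash.new { |h, k| h[k] = [] }
  end

  def on(type, &blk)
    _callbacks[type] << blk
    self
  end

  def emit(type, *args)
    _callbacks[type].each { |blk| blk.call(*args) }
  end
end

# Hypothetical class, used only to demonstrate the module.
class Door
  include EventEmitter
end

log = []
door = Door.new

# #on returns self, so registrations can be chained.
door.on(:open) { |who| log << "first: #{who}" }
    .on(:open) { |who| log << "second: #{who}" }

door.emit(:open, "alice") # both listeners run, in registration order
door.emit(:slam)          # no listeners registered: nothing happens
```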

The IO loop

Next, we need something to fire up these events. As you will see in
the following code, the general flow of an event loop is simple:
detect new events, run their associated callbacks, and then repeat
the whole process again.

    class IOLoop
      # List of streams that this IO loop will handle.
      attr_reader :streams

      def initialize
        @streams = []
      end

      # Low-level API for adding a stream.
      def <<(stream)
        @streams << stream
        stream.on(:close) do
          @streams.delete(stream)
        end
      end

      # Some useful helpers:
      def io(io)
        stream = Stream.new(io)
        self << stream
        stream
      end

      def open(file, *args)
        io File.open(file, *args)
      end

      def connect(host, port)
        io TCPSocket.new(host, port)
      end

      def listen(host, port)
        server = Server.new(TCPServer.new(host, port))
        self << server
        server.on(:accept) do |stream|
          self << stream
        end
        server
      end

      # Start the loop by calling #tick over and over again.
      def start
        @running = true
        tick while @running
      end

      # Stop/pause the event loop after the current tick.
      def stop
        @running = false
      end

      def tick
        @streams.each do |stream|
          stream.handle_read  if stream.readable?
          stream.handle_write if stream.writable?
        end
      end
    end

Notice here that IOLoop#start blocks everything until IOLoop#stop is called.
Everything after IOLoop#start will happen in callbacks, which means that the
control flow can be surprising. For example, consider the following code:
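
A self-contained sketch along these lines is shown below. It is an approximation under stated assumptions: MiniStream is a cut-down stand-in for the Stream class covered later, a pipe replaces the network connection, and calling the handlers by hand stands in for IOLoop#start. The numbered comments follow the steps discussed next.

```ruby
module EventEmitter
  def _callbacks
    @_callbacks ||= Hash.new { |h, k| h[k] = [] }
  end

  def on(type, &blk)
    _callbacks[type] << blk
    self
  end

  def emit(type, *args)
    _callbacks[type].each { |blk| blk.call(*args) }
  end
end

# Cut-down stand-in for the article's Stream (an assumption for this sketch).
class MiniStream
  include EventEmitter
  attr_reader :buffer

  def initialize(io)
    @io = io
    @buffer = String.new
  end

  # Only appends to the in-memory buffer; nothing is sent yet.
  def <<(chunk)
    @buffer << chunk
    self
  end

  def handle_write
    return if @buffer.empty?
    written = @io.write_nonblock(@buffer)
    @buffer.slice!(0, written)
  rescue IO::WaitWritable
    # Not writable right now; try again on a later tick.
  end

  def handle_read
    emit(:data, @io.read_nonblock(4096))
  rescue IO::WaitReadable
    # Nothing to read yet.
  rescue EOFError
    emit(:close)
  end
end

read_io, write_io = IO.pipe
events = []

out = MiniStream.new(write_io)             # 1. wrap the IO objects
inp = MiniStream.new(read_io)
out << "hello"                             # 2. buffered, not sent
inp.on(:data)  { |chunk| events << chunk } # 3. runs once data arrives
inp.on(:close) { events << :closed }       # 4. runs once the writer closes
events_before_loop = events.dup            # still empty at this point

# 5. Stand-in for IOLoop#start: drive the handlers by hand.
out.handle_write
inp.handle_read
write_io.close
inp.handle_read
```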

You might think that you’re writing data in step 2, but the
<< method actually just stores the data in a local buffer.
It’s not until the event loop has started (in step 5) that the data
actually gets sent. The IOLoop#start method triggers #tick to be run in a loop, which
delegates to Stream#handle_read and Stream#handle_write. These methods
are responsible for doing any necessary I/O operations and then triggering
events such as :data and :close, which you can see being used in steps 3 and 4. We’ll take a look at how Stream is implemented later, but for now
the main thing to take away from this example is that event-driven code
cannot be read in top-down fashion as if it were procedural code.

Studying the implementation of IOLoop should also reveal why it’s
so terrible to block inside a callback. For example, take a look at this
call graph:
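
A small runnable sketch can trace that sequence of calls. The list of lambdas stands in for the callbacks one tick would run, and a 0.2-second sleep stands in for the 5-second blocking call; both are assumptions made to keep the example short.

```ruby
# Callbacks that a single tick of the loop would run, in order.
callbacks = [
  -> { :first },
  -> { sleep 0.2; :second }, # blocks; stands in for a 5-second call
  -> { :third },
]

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Record each callback's result and the time at which it finished.
trace = callbacks.map do |callback|
  result = callback.call
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

trace.each do |name, finished_at|
  puts format("%-7s finished after %.2fs", name, finished_at)
end
```

The third callback does no work of its own, yet it cannot finish until the second one returns.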

By blocking inside the second callback, the I/O loop has to wait 5 seconds
before it’s able to call the rest of the callbacks. This wait is
obviously a bad thing, and it is important
to avoid such a situation when possible. Of course, nonblocking
callbacks are not enough—the event loop also needs to make use of nonblocking
I/O. Let’s go over that a bit more now.

IO events

At the most basic level, there are only two events for an IO object:

Readable: The IO is readable; data is waiting for us.

Writable: The IO is writable; we can write data.

These might sound a little confusing: how can a client know that the server
will send us data? It can’t. Readable doesn’t mean “the server will send us
data”; it means “the server has already sent us data.” In that case, the data
is handled by the kernel in your OS. Whenever you read from an IO object, you’re
actually just copying bytes from the kernel. If the receiver does not read
from IO, the kernel’s buffer will become full and the sender’s IO will
no longer be writable. The sender will then have to wait until the
receiver can catch up and free up the kernel’s buffer. This situation is
what makes nonblocking IO operations tricky to work with.
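
The kernel buffer can be observed directly with a pipe. The sketch below writes without ever reading until the write end stops being writable, then drains the read end and writes again. The 4096-byte chunk size is an arbitrary choice, and the actual buffer capacity depends on the operating system.

```ruby
reader, writer = IO.pipe
chunk = "x" * 4096
queued = 0
buffer_full = false

# Keep writing without reading until the kernel's buffer is full.
begin
  loop { queued += writer.write_nonblock(chunk) }
rescue IO::WaitWritable
  buffer_full = true
end

# The receiver catches up by reading everything that was queued.
drained = 0
begin
  loop { drained += reader.read_nonblock(65_536).bytesize }
rescue IO::WaitReadable
  # Buffer is empty again.
end

# With the buffer freed, the sender can write once more.
writable_again = writer.write_nonblock(chunk) > 0

puts "queued #{queued} bytes before the pipe stopped being writable"
```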

Because these low-level operations can be tedious to handle manually, the
goal of an I/O loop is to trigger some more usable events for application
programmers:

Data: A chunk of data was sent to us.

Close: The IO was closed.

Drain: We’ve sent all buffered outgoing data.

Accept: A new connection was opened (only for servers).

All of this functionality can be built on top of Ruby’s IO objects with
a bit of effort.

Working with the Ruby IO object

Ruby offers several ways to read from an IO object:

io.read reads until the IO is closed (end of file, the server closing the
connection, and so on).

io.read(12) reads until it has received exactly 12 bytes.

io.readpartial(12) waits until the IO becomes readable, then it reads at
most 12 bytes. So if a server sends only 6 bytes, readpartial will return
those 6 bytes. If you had used read(12), it would wait until 6 more bytes were
sent.

io.read_nonblock(12) will read at most 12 bytes if the IO is readable. It
raises IO::WaitReadable if the IO is not readable.
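
The differences are easy to see with a pipe, which behaves like a socket for this purpose. The sketch below is only an illustration; the byte counts are arbitrary.

```ruby
reader, writer = IO.pipe

writer.write("abcdef") # only 6 bytes are available

# readpartial returns as soon as some data is there, even if it is
# less than the 12 bytes we asked for.
partial = reader.readpartial(12)

# The pipe is now empty, so read_nonblock raises instead of waiting.
nonblock_result =
  begin
    reader.read_nonblock(12)
  rescue IO::WaitReadable
    :would_block
  end

writer.write("0123456789ABCDEF") # 16 bytes

# read(12) returns exactly 12 bytes (it would block if fewer arrived).
exact = reader.read(12)

writer.close

# read without a length returns everything up to end of file.
rest = reader.read
```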

For writing, there are two methods:

    length = io.write(str)
    length = io.write_nonblock(str)

io.write writes the whole string to the IO, waiting until the IO becomes
writable if necessary. It returns the number of bytes written (which should
always be equal to the number of bytes in the original string).

io.write_nonblock writes as many bytes as possible until the IO becomes
nonwritable, returning the number of bytes written. It raises IO::WaitWritable
if the IO is not writable.
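
The following sketch contrasts the two return values, again using a pipe. The one-million-byte string is an arbitrary size chosen to be larger than a typical pipe buffer, so that write_nonblock can only accept part of it.

```ruby
reader, writer = IO.pipe

# A small blocking write completes fully and reports the full length.
full_length = writer.write("hello")

# A large nonblocking write accepts only what fits in the kernel buffer.
pending = "x" * 1_000_000
total = pending.bytesize
written = writer.write_nonblock(pending)

# Keep the unsent remainder so it can be retried once the IO is writable.
pending.slice!(0, written)

puts "wrote #{written} of #{total} bytes; #{pending.bytesize} still pending"
```

Holding on to the unsent remainder is exactly the job of a write buffer in an event loop.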

The challenge when both reading and writing in a nonblocking fashion is knowing
when it is possible to do so and when it is necessary to wait.

Getting real with IO.select

We need some mechanism for knowing when we can read from or write to our
streams, but I’m not going to implement Stream#readable? or #writable?. It’s
a terrible solution to loop over every stream object in Ruby and check whether it’s
readable/writable over and over again. This is really just not a job for Ruby;
it’s too far away from the kernel.

Luckily, the kernel exposes ways to efficiently detect readable and writable
I/O streams. The simplest cross-platform method is called select(2)
and is available in Ruby as IO.select:

IO.select(read_array [, write_array [, error_array [, timeout]]])
Calls select(2) system call. It monitors supplied arrays of IO objects and waits
until one or more IO objects are ready for reading, ready for writing, or have
errors. It returns an array of those IO objects that need attention. It returns
nil if the optional timeout (in seconds) was supplied and has elapsed.

IO.select will block until some of our streams become readable or writable
and then return those streams. From there, it is up to those streams to do
the actual data processing work.
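
Applied to our loop, a tick built on IO.select could look roughly like the sketch below. Watched is a hypothetical stand-in for Stream, and the standalone tick method takes separate lists of readers and writers only because the two ends of a pipe are one-directional; with sockets, the same list could be passed for both. IO.select accepts any object that responds to #to_io and returns those same objects when they are ready.

```ruby
# Hypothetical stand-in for Stream.
class Watched
  def initialize(io, log)
    @io = io
    @log = log
  end

  # IO.select calls this to find the underlying IO object.
  def to_io
    @io
  end

  def handle_read
    @log << [:read, @io.read_nonblock(4096)]
  end

  def handle_write
    @log << [:write]
  end
end

# One iteration of the loop. Returns false if the timeout elapsed.
def tick(readers, writers, timeout = nil)
  ready = IO.select(readers, writers, nil, timeout)
  return false unless ready

  readable, writable = ready
  readable.each(&:handle_read)
  writable.each(&:handle_write)
  true
end

read_io, write_io = IO.pipe
log = []
reader = Watched.new(read_io, log)
writer = Watched.new(write_io, log)

tick([reader], [writer], 0.1) # only the write end is ready
first_tick = log.dup
log.clear

write_io.write("ping")
tick([reader], [writer], 0.1) # now both ends are ready
second_tick = log.dup

timed_out = !tick([reader], [], 0.05) # nothing left to read
```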

Handling streaming input and output

Now that we’ve used the Stream object in various examples, you may
already have an idea of what its responsibilities are. Let’s now take a look at how it is implemented:

    class Stream
      # We want to bind/emit events.
      include EventEmitter

      def initialize(io)
        @io = io
        # Store outgoing data in this String.
        @writebuffer = ""
      end

      # This tells IO.select what IO to use.
      def to_io; @io end

      def <<(chunk)
        # Append to buffer; #handle_write is doing the actual writing.
        @writebuffer << chunk
      end

      def handle_read
        chunk = @io.read_nonblock(4096)
        emit(:data, chunk)
      rescue IO::WaitReadable
        # Oops, turned out the IO wasn't actually readable.
      rescue EOFError, Errno::ECONNRESET
        # IO was closed
        emit(:close)
      end

      def handle_write
        return if @writebuffer.empty?
        length = @io.write_nonblock(@writebuffer)
        # Remove the data that was successfully written.
        @writebuffer.slice!(0, length)
        # Emit "drain" event if there's nothing more to write.
        emit(:drain) if @writebuffer.empty?
      rescue IO::WaitWritable
      rescue EOFError, Errno::ECONNRESET
        emit(:close)
      end
    end

Stream is nothing more than a wrapper around a Ruby IO object that
abstracts away all the low-level details of reading and writing that were
discussed throughout this article. The Server object we make use of
in IOLoop#listen is implemented in a similar fashion but is focused
on accepting incoming connections instead:

    class Server
      include EventEmitter

      def initialize(io)
        @io = io
      end

      def to_io; @io end

      def handle_read
        sock = @io.accept_nonblock
        emit(:accept, Stream.new(sock))
      rescue IO::WaitReadable
      end

      def handle_write
        # do nothing
      end
    end

Now that you’ve studied how these low-level objects work, you should
be able to revisit the full source code for the Chat Server
example and understand exactly how it works. If you
can do that, you know how to build an evented I/O loop from scratch.

Conclusions

Although the basic ideas behind event-driven I/O systems are easy to understand,
there are many low-level details that complicate things. This article discussed some of these ideas, but there are many others that would need
to be considered if we were trying to build a real event library. Among
other things, we would need to consider the following problems:

Because our event loop does not implement timers, it is difficult to do
a number of important things. Even something as simple as keeping a
connection open for a set period of time can be painful without built-in
support for timers, so any serious event library must support them. It’s
worth pointing out that IO.select does accept a timeout parameter, and
it would be possible to make use of it fairly easily within this codebase.

The event loop shown in this article has no way to apply back pressure,
so data continues to be buffered without bound even if it
has not been accepted for processing yet. Because our event loop
provides no mechanism for signaling that its buffers are full, incoming
data will accumulate and have a similar effect to a memory leak until
the connection is closed or the data is accepted.

The cost of select(2) grows linearly with the number of monitored streams,
which means that handling 10,000 streams will take roughly 10,000x as long
as handling a single stream. Alternative mechanisms (such as epoll on Linux
and kqueue on the BSDs and macOS) do exist at the kernel level, but many are
not cross-platform and are not exposed to Ruby by default. If you have
high performance needs, you may want to look into the nio4r
project, which attempts to solve this problem in a clean way by
wrapping the libev library.
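
As an illustration of the first point, timers could be layered on top of the IO.select timeout roughly as follows. This is a sketch under stated assumptions rather than a design taken from any library: TimerLoop, add_timer, and the Timer struct are hypothetical names. The loop sleeps in IO.select only until the earliest timer is due, then fires whatever has expired.

```ruby
# Hypothetical timer-aware loop built on IO.select's timeout.
class TimerLoop
  Timer = Struct.new(:fire_at, :callback)

  def initialize
    @timers = []
  end

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Schedule the block to run once, `delay` seconds from now.
  def add_timer(delay, &callback)
    @timers << Timer.new(now + delay, callback)
  end

  # Wait for readable IOs, but no longer than the earliest timer allows.
  # Returns the list of readable IOs (possibly empty).
  def tick(ios)
    next_timer = @timers.min_by(&:fire_at)
    timeout = next_timer ? [next_timer.fire_at - now, 0].max : nil

    readable, = IO.select(ios, nil, nil, timeout)

    due, @timers = @timers.partition { |timer| timer.fire_at <= now }
    due.each { |timer| timer.callback.call }

    readable || []
  end
end

reader, writer = IO.pipe
timer_loop = TimerLoop.new
fired = []

timer_loop.add_timer(0.05) { fired << :timeout }
started = timer_loop.now
timer_loop.tick([reader]) while fired.empty? # no data: the timer wakes us
elapsed = timer_loop.now - started

writer.write("x")
timer_loop.add_timer(5) { fired << :late }
ready = timer_loop.tick([reader]) # data is waiting: returns right away
```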

The challenges involved in getting the details right in event loops
are the real reason why tools like EventMachine and Node.js exist. These systems
allow application programmers to gain the benefits of event-driven I/O without
having to worry about too many subtle details. Still, knowing how they work under the hood
should help you make better use of these tools, and should also take away some
of the feeling that they are a kind of deep voodoo that you’ll never
comprehend. Event-driven I/O is perfectly understandable; it is just a bit
messy.
