Socket Timeouts in Ruby

Mar 15, 2009

One of Ruby's weaknesses is its poor networking performance. Much of that has to do with the net/http implementation, which uses Ruby's awful Timeout library. The issues with Timeout are well documented. SystemTimer provides a reliable alternative that also performs better.

However, I started today wondering if there was a better way. Enabling timeouts has a huge performance hit on my memcache-client library, and reducing the overhead would go a long way toward making it perform safely and quickly. Since C programs also need socket timeouts, I figured there had to be a low-level alternative, and indeed there is: the SO_SNDTIMEO and SO_RCVTIMEO socket options. It's a bit involved to create a proper socket with these options, but it is possible:
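A minimal sketch of such a socket, assuming a hypothetical helper named connect_with_timeout and a timeout given in seconds as a Float. The struct timeval layout packed here (two native longs) matches common 64-bit Unix systems and may differ elsewhere, and how strictly the interpreter honors these options can vary between Ruby versions:

```ruby
require 'socket'

# Hypothetical helper (the name is an assumption, not from memcache-client):
# opens a TCP connection with send/receive timeouts applied before connect.
def connect_with_timeout(host, port, timeout)
  # Resolve the host by hand so the address family passed to Socket.new
  # matches the address later handed to connect.
  addr = Socket.getaddrinfo(host, nil, nil, Socket::SOCK_STREAM).first
  family = addr[4] # numeric address family (AF_INET or AF_INET6)
  ip     = addr[3] # numeric address string

  sock = Socket.new(family, Socket::SOCK_STREAM, 0)

  # Build the bytes of a C struct timeval: seconds, then microseconds.
  secs    = timeout.to_i
  usecs   = ((timeout - secs) * 1_000_000).to_i
  timeval = [secs, usecs].pack('l_2')

  # Set the options before connecting so the connection attempt is covered too.
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_SNDTIMEO, timeval)

  sock.connect(Socket.pack_sockaddr_in(port, ip))
  sock
end
```

A call such as connect_with_timeout('localhost', 11211, 0.5) would then return a connected socket whose reads and writes fail with an error instead of blocking for longer than half a second.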

We use the low-level operations, Socket.new and connect, rather than just TCPSocket.new(host, port) because otherwise we can't set the socket options before the connection is attempted; we want to ensure the connection attempt itself can time out too.

We have to look up the host via DNS by hand because some systems (*cough*, OS X) can return either IPv6 or IPv4 addresses, and the address family constant used in Socket.new must match the address used in the connect call.

The setsockopt method expects the option value as the raw bytes of a native C struct (a struct timeval in this case), so we need to construct it using the Array#pack method.

Awesome. With raw socket timeouts, there is no performance impact! SystemTimer provides an excellent replacement for Timeout if you want to guarantee a ceiling on the time spent in an arbitrary block, but if you just need timeouts for low-level socket operations, nothing beats the operating system's native socket timeout support.

There is a caveat in the paragraph above: low-level socket operations. memcache-client uses three IO methods: read, write and gets. The first two are low-level and time out properly, but gets is built on top of the low-level read operation and has to ignore the EAGAIN error in order to ensure it returns a full line of text, so the raw socket timeout never surfaces from it. So we use a hybrid approach: read and write use the raw socket timeouts and gets uses SystemTimer. It's not quite as fast as with no/raw timeouts but it's definitely an improvement:

So we've gone from 22 sec with Timeout to 15 sec with SystemTimer to 9 sec using raw socket timeouts where possible (GitHub commit). For my next trick, I figure I'll rewrite gets to use read so I can remove the need for SystemTimer and Timeout altogether.
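As a rough idea of what that next step could look like, here is a sketch of a line reader built on sysread. The LineReader class and its chunk size are hypothetical, not memcache-client's actual code. The assumption is that a receive timeout makes the underlying read fail with an error such as Errno::EAGAIN, which this version lets propagate rather than ignoring:

```ruby
# Hypothetical line reader (an assumption for illustration): implements gets
# on top of sysread so that only the socket-level timeout applies.
class LineReader
  def initialize(io, chunk_size = 4096)
    @io = io
    @chunk_size = chunk_size
    @buffer = String.new # mutable buffer holding bytes read but not yet returned
  end

  # Returns the next line, including its trailing "\n".
  # Errors raised by sysread (EOFError at end of stream, Errno::EAGAIN when a
  # receive timeout fires) are not rescued here, so the caller sees them.
  def gets
    until (idx = @buffer.index("\n"))
      @buffer << @io.sysread(@chunk_size)
    end
    @buffer.slice!(0, idx + 1)
  end
end
```

Because the reader keeps its own buffer, it must be the only thing reading from the socket; mixing it with the socket's built-in buffered gets or read would lose data.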