Last week I ran into a nasty little problem while implementing an application with soft real-time requirements. I was aiming at 1 ms or less for a TCP-based request-response roundtrip on a local network. Should be trivial, but why did my tests indicate that I wasn’t even getting close?

The scenario was simple: A server (my part) gets a request from a client. Before it can answer, it has to ask a backend system for some information. The backend system listens on a TCP port and answers in a query-response fashion (processing time is far below 1 ms). Both query and response typically fit in a single TCP segment. The response may sometimes be larger, which is one reason why TCP was an adequate choice for the backend system.
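Sketched in Java, one query-response roundtrip on an already-open connection to the backend looks roughly like this (the newline-terminated wire format is a made-up example, not the actual protocol):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class BackendClient {
    // One query-response roundtrip on an already-connected socket.
    // Wire format (newline-terminated lines) is hypothetical.
    public static String query(Socket socket, String request) throws IOException {
        PrintWriter out = new PrintWriter(
                socket.getOutputStream(), true /* autoflush on println() */);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
        out.println(request);  // typically fits in a single TCP segment
        return in.readLine();  // the response may span several segments
    }
}
```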

As a first optimization I used a TCP connection pool to get around TCP’s three-way handshake (SYN, SYN-ACK, ACK) and thus cut down latency. However, tracing showed response times around 40 ms on the local network. Actually worse than without the connection pool! Admittedly, the conventional use for connection pools is to circumvent TCP’s slow start mechanism. But still, something was very wrong.
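For reference, the pool itself can be as simple as a queue of pre-opened sockets. This is a minimal sketch, not the production code:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal connection pool: open all sockets up front and reuse them,
// so individual queries skip TCP's three-way handshake.
public class SocketPool {
    private final BlockingQueue<Socket> idle;

    public SocketPool(InetAddress host, int port, int size) throws IOException {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(new Socket(host, port)); // handshake happens once, here
        }
    }

    public Socket borrow() throws InterruptedException {
        return idle.take(); // blocks if all connections are in use
    }

    public void release(Socket s) {
        idle.add(s); // caller must only return healthy, open sockets
    }
}
```

A real pool would also validate returned connections and replace dead ones; that bookkeeping is omitted here.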

I quickly guessed that this was a buffering issue with the TCP stack or maybe Java’s socket wrapper. Since calling flush() on the socket didn’t seem to help either, I turned to the Socket FAQ. As usual, the FAQ provides you with two pieces of information: a) the solution that works well in practice and b) the impression that things are more complicated than one would guess.

The relevant section of the FAQ is "2.11. How can I force a socket to send the data in its buffer?". It starts discouragingly with a quote from Richard Stevens himself:

"You can’t force it. Period. TCP makes up its own mind as to when it can send data. Now, normally when you call write() on a TCP socket, TCP will indeed send a segment, but there’s no guarantee and no way to force this."

Sounds bad. But wait, wasn’t there a PUSH flag in the TCP header for speeding things up a bit? From RFC 793:

"A sending TCP is allowed to collect data from the sending user and to send that data in segments at its own convenience, until the push function is signaled, then it must send all unsent data. When a receiving TCP sees the PUSH flag, it must not wait for more data from the sending TCP before passing the data to the receiving process."

Too bad: According to the FAQ, the socket API gives you no way to trigger this magic push functionality! It’s not connected to flush(), as one might have guessed.

So, game over? Fortunately not. One thing you can do is disable Nagle’s Algorithm by setting TCP_NODELAY on the socket (in Java, that’s Socket.setTcpNoDelay(true)). Nagle’s Algorithm is basically a buffering facility inside the kernel’s TCP stack. It makes sure that data from multiple write() calls doesn’t necessarily result in TCP segments being sent immediately. Instead, data is collected, reducing network overhead for applications that sometimes send just a single byte of user data over the network (telnet-style applications come to mind). Unfortunately, it gets in our way here, causing the 40 ms delays mentioned earlier.
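In Java, disabling Nagle’s Algorithm is a one-liner on the socket. A minimal, self-contained sketch (the loopback listener exists only so the example runs on its own):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NoDelayDemo {
    public static boolean connectWithNoDelay() throws IOException {
        // Loopback listener just so the example is self-contained.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
            // Disable Nagle's Algorithm: data from write() calls is sent
            // as segments immediately instead of being coalesced.
            socket.setTcpNoDelay(true);
            return socket.getTcpNoDelay();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY set: " + connectWithNoDelay());
    }
}
```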

Thanks to TCP_NODELAY, response times to the backend system dropped below the 1 ms threshold. Great!

The moral of this story: If your TCP-based real-time application suffers from bad latency, try setting TCP_NODELAY. And also, even if you know TCP fairly well, you still don’t get the full picture unless you also know how your operating system’s TCP stack and APIs work.

13 Responses to Using TCP for Low-Latency Applications

TCP uses its own flow control mechanism, and the throughput is related to the congestion window, network conditions (i.e., congestion), and error rates. Some TCP variants do provide mechanisms other than slow start, but be careful about fairness among TCP connections. That is to say, you don’t want one connection to take over most of the bandwidth. Try to tune your TCP congestion window if possible.

Also, many operating systems have highly optimized TCP stacks, since most heavy-weight applications and protocols are TCP-based. Not always so with UDP. You’d be surprised how often TCP yields better performance than UDP.

I’m working on a Java-based TCP game server in which I’ve achieved latencies far below 1 ms. In fact, I had to use nanoTime because milliseconds weren’t fine-grained enough to measure the difference (I kept measuring 0), so microseconds were used instead.
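Measuring at that resolution in Java means System.nanoTime(); a quick sketch of how a sub-millisecond roundtrip might be timed (the timed work here is a stand-in, not the commenter’s server code):

```java
// Time a piece of work with nanosecond resolution and report microseconds.
// System.currentTimeMillis() would simply read 0 for sub-millisecond work.
public class LatencyTimer {
    public static long timeMicros(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000; // ns -> µs
    }

    public static void main(String[] args) {
        long micros = timeMicros(() -> Math.sqrt(42.0)); // placeholder work
        System.out.println(micros + " µs");
    }
}
```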

In addition to setTcpNoDelay, there’s also setPerformancePreferences(connectionTime, latency, bandwidth).

If you pool your connections, set connectionTime and bandwidth to lower values, and set latency to the highest.
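Note that setPerformancePreferences() expresses only the relative importance of the three values, must be called before the socket is connected, and is a hint the implementation may ignore. A hypothetical sketch combining it with TCP_NODELAY:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LatencyPreference {
    public static Socket connectLatencyFirst(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(); // unconnected: preferences must be set before connect()
        // Relative weights only: latency (2) outranks connection time (1) and bandwidth (0).
        socket.setPerformancePreferences(1, 2, 0);
        socket.setTcpNoDelay(true); // usually combined with disabling Nagle
        socket.connect(new InetSocketAddress(host, port));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = connectLatencyFirst(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
            System.out.println("connected: " + s.isConnected());
        }
    }
}
```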

I had a similar case with real-time data. I’m running a TCP server on an Android tablet, and the response was around 500 ms to 2000 ms. Killing me when I need 100 ms or less. Turning off Nagle’s algorithm improved the time to ~25 ms. Saved my current application.