Monday, January 3, 2011

Node.js performance? µhttpd performance!

There's been a lot of hoopla recently about node.js. Being an object-head, I've always liked the idea of reactive (event-driven) web servers: after all, such a server behaves just like a typical object, sitting there waiting for something to happen and then reacting to it.

Of course, there is also a significant body of research on this topic,
showing for example that user-level thread implementations tend to
get very similar performance to event-based servers. There is also
the issue that the purity of "no blocking APIs" is somewhat naive on
a modern Unix, because blocking on I/O can happen in lots of different
non-obvious places. At the very least, you may encounter a page-fault,
and this may even be desirable in order to use memory mapped files.

In those cases, the fact that you have purified all your APIs makes
no difference, you are still blocked on I/O, and if you've completely
foregone kernel threads like node.js appears to do, then your entire
server is now blocked!

Anyway, having seen some interesting node.js benchmarking, I was obviously curious to see how
my little embedded Objective-C http-server based on the awesome GNU libmicrohttpd stacked up.

The baseline is a typical static serving test, where Apache
(out-of-the box configuration on Mac OS X client)
serves a small static file and the two app servers serve a small
static string.

Platform               # requests/sec
Static (via Apache)           6651.58
Node.js                       5793.44
MPWHttp                       8557.83

The sleep(2) example showed node.js at its best. Here,
each request sleeps for 2 seconds before returning a
small result.

Platform               # requests/sec
Static (via Apache)                 -
Node.js                         88.48
MPWHttp                         47.04

The compute example is where MPWHttp shines. The task is trivial:
just counting up from 1 to 10000000 (ten million).

Platform               # requests/sec
Static (via Apache)                 -
Node.js                          9.62
MPWHttp                       7698.65

So counting up, libmicrohttpd with MPWHttp is almost a thousand times faster? The reason is of course that such a simple task is taken care of by
strength reduction in the
optimizer, which replaces the loop of 10 million increments with a single addition of 10 million. Cheating? On a benchmark, probably, but
on the other hand that's the sort of
benefit you get from a good optimizing compiler.

To make the comparison a little bit more fair, I added an xor with
a randomly initialized value so that the optimizer could not remove
the loop (verified by varying the loop count).

Platform               # requests/sec
Static (via Apache)                 -
Node.js                          9.62
MPWHttp                        222.9

So still around 20 times faster. MPWHttp was also using both cores of my
MacBook Pro, whereas node.js was limited to 1 core (so a 10x single-core
speed difference).

Cross-checking on my 8 core Mac Pro gave the following results:

Platform               # requests/sec
Static (via Apache)                 -
Node.js                         10.72
MPWHttp                       1011.86

Due to utilizing the available cores, MPWHttp/libmicrohttpd is now about
100 times faster than node.js on the compute-bound task.

In conclusion, I think it is fair to say that node.js succeeds
admirably in a certain category of tasks: lots of concurrency,
lots of blocked I/O, very little computation, and little enough memory
use that we don't page fault. In more typical mixes with some
concurrency, some computation, some I/O, and a bit of memory use
(so a chance of paging), a more balanced approach may be better.