Wednesday, April 6, 2011

HipHop talk

I just went to a talk by Facebook HipHop engineer Haiping Zhao about various systems issues FB has had. There was a discussion of language issues with data movement in distributed systems and how to get that right, but I'm going to focus on HipHop, Facebook's open source PHP to C++ translator, since that's more up my alley.

Thinking carefully about HipHop, it seems to make better tradeoffs for large web deployments than the (at this point) traditional JIT approach. HipHop essentially trades compilation time, code size, and maximum performance for performance predictability and immediate startup performance. Furthermore, static compilation is probably perceived as being less likely to have bugs than a JIT compilation to someone in charge of deploying the system, although I would argue against that perception.

The main argument against static compilation of dynamic languages is that you don't have enough information to generate efficient code, and that you need to be able to run it before compiling it. To address that, HipHop is uses feedback directed (aka profile guided) optimization, even though PGO has never caught on in the C/C++ world. Essentially, they translate all their PHP to instrumented C++ to get types of variables, and then run that on an instance with a model workload. The compile takes about 15 minutes to make a 1 GB binary (the gold linker helps), and the profiling takes about another 15 minutes. Then they re-do the compilation with the profile information. This allows optimizations like speculating the target of a call, and speculating the type of a variable as being int or string, the easy to optimize types.

I would argue that this approach won't have as good peak performance as a JIT, which only has to generate the specialized version of the code, and can do some more clever optimizations. Also, a JIT can always fallback to the interpreter for the general case. If your profiling ends up being wrong (unlikely), a JIT can respond appropriately.

The drawback of the JIT is that you don't know when or if your code is going to get compiled, and you have to wait for that to happen. Debugging performance problems with JITed code is tricky. On the other hand, there are many solid tools for profiling C++, and mapping that back to PHP doesn't sound too hard.

It's probably possible to do a little better in the static model by avoiding generating or loading code for slow paths that you feel will never actually be loaded. I imagine right now the generated code looks like:

The else branch is probably never taken. It seems like there should be techniques to either move the else branch far away from the hot code so that the cold code is never even faulted in from disk, or you could try to rig up a fallback to the interpreter, which I think would be much harder and not help your code size. On the other hand, branch predictors these days are good and moving code around is really only going to show up as instruction cache effects, which are pretty hard to predict. HipHop probably already uses __builtin_expect to tell g++ which branch it expects to take as well.

One other major thing I remembered during the talk is that optimizing PHP turns out to be much easier than Python. In Python, functions, classes, and globals in a module all share the same namespace. In PHP, variables start with a $ and functions do not. Short of eval, which HipHop forbids, there is no way to shadow a function without the compiler noticing that it might be shadowed. For example, this works:

That will set $bar to true. This is generally rare and easy to implement by falling back to the standard dynamic lookup.

OTOH, local variable lookup seems like it sucks because PHP supports something like the following:

$which = $a ? "foo" : "bar";$$which = "string";

Which will assign "string" to the variable $foo or $bar depending on whether $a is true or false. Hopefully the scoping rules restrict this to the scope of the local function, so you only need to worry about this feature in functions that use it.

Another feature of HipHop is that, once you forbid certain features like eval or using include to pull in generated or uploaded source from the filesystem, you get to do whole program compilation. One of the major optimization blockers even for C and C++ is that the compiler only gets to see a translation unit (TU) at a time. HipHop gets to see the whole program, so it makes it possible to do the checks I mentioned earlier, like making sure that there is only a single definition of a function 'foo' at top-level which is always present, so you can statically bind it at the call-site even if it's in a different source file.

Finally, Haiping mentioned that the massive binary produced is deployed to servers using BitTorrent, which surprisingly seems like exactly the right tool for the job. You could try to be more clever if you knew something about the network topology, or you could just use a general, well-tested tool and call it a day. =D

P.S. Hopefully none of this is considered FB confidential and Haiping doesn't get in any trouble. It was a very interesting talk!