Fun with Compiler Optimization Flags

I thought I'd comment on something in this article (which I blogged about yesterday) that has absolutely nothing to do with the actual Apple/Intel/IBM debate. Here's a snippet from the article:

"So why didn't Apple take any of these offers? Was it performance, as Jobs claimed in his keynote? Here's something that may blow your mind. When Apple compiles OS X on the 970, they use -Os. That's right: they optimize for size, not for performance. So even though Apple talked a lot of smack about having a first-class 64-bit RISC workstation chip under the hood of their towers, in the end they were more concerned about OS X's bulging memory requirements than they were about The Snappy(TM)."

The above statement isn't entirely correct. Speed and size are not diametrically opposed in this case. When it comes to kernel and OS level code, compiling with -Os can actually produce faster code than, say, -O2 or -O3, especially on architectures with a relatively small amount of L2 cache. Basically, smaller code means more of the core code stays in the cache, and that can get you more performance than loop unrolling and the other fun that comes with -O2 or -O3, which tends to make the code bigger. The Fedora/Red Hat kernel team did a bunch of benchmarking on this, and the Fedora kernel is now compiled with -Os.
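If you want to see the size side of this trade-off for yourself, here is a minimal sketch. The file name sum.c and the function sum_bytes are made up for illustration and have nothing to do with the article or with Apple's or Fedora's actual build. It is just a small loop that a compiler may choose to expand at higher optimization levels. How much the object code differs depends on your compiler version and target, so treat the result as something to measure, not something I'm promising.

```c
/* sum.c: hypothetical example for comparing code size across optimization levels.
 *
 * Build the same file twice and compare the text (code) sizes:
 *
 *   gcc -Os -c sum.c -o sum_Os.o
 *   gcc -O3 -c sum.c -o sum_O3.o
 *   size sum_Os.o sum_O3.o
 */
#include <stddef.h>
#include <stdint.h>

/* Adds up every byte in buf. A simple counted loop like this is a typical
 * candidate for size-increasing optimizations at -O3, while -Os asks the
 * compiler to favor smaller code. */
uint32_t sum_bytes(const uint8_t *buf, size_t len)
{
    uint32_t total = 0;
    for (size_t i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}
```

A smaller text section on its own doesn't prove the code runs faster. That part only shows up when you benchmark a real workload, where the whole working set is competing for cache.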
–jeremy