I'm looking for tools and approaches for determining which parts of my Cocoa and Cocoa Touch programs contribute most to the final binary image size, and for ways to reduce it. I'm not looking for a "magic bullet" compiler flag. I'm looking for profiling techniques for evaluating and reducing image size waste, in the same vein as Shark and Instruments help with run-time evaluation.

A first-order approximation may be the size of the .o's, but how trustworthy is this in terms of final image size after optimizations and dead-code stripping? If I add up all the .o's they are much larger than my final image, so clearly the linker is already helping me out significantly. But this means the size of the .o's may not be a useful measure.

Where do others look to reduce image size without undermining code maintainability?

2 Answers

Apple has some awesome docs on Code Size Performance Guidelines, almost all of which apply to this question in some form. There are even tips for pedantic approaches like manually ordering symbols in the binary if desired. :-)

I'm totally a fan of simple, slim code and minimizing disk/memory footprint. Premature optimization is always a bad idea, but consistent housekeeping can be a good way to prevent cruft from accumulating. Unfortunately, I don't know of an automated way to profile code sizes, but several tools exist that can help provide specific insight.

Binary Image Size

Object files aren't as terrible an approximation as you'd guess. One reason the linked binary is smaller than the sum of its parts is that the code is all joined together under a single header. Although the percentages won't be precise, the biggest object files are the biggest parts of the linked binary.
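
As a rough sketch of that first-order view, the function below lists every .o file under a directory, largest first. The function name and the build directory in the usage comment are hypothetical, and raw file size includes debug information, so treat the ranking as a hint rather than a measurement.

```shell
# Print "bytes<TAB>path" for every .o file under the given directory,
# largest first. rank_objects is a hypothetical helper name.
rank_objects() {
    find "$1" -type f -name '*.o' | while IFS= read -r f; do
        # wc -c pads its output with spaces on some systems; strip them.
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -rn
}

# Usage (the build directory is a placeholder):
#   rank_objects build/MyApp.build | head -n 20
```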

For understanding the raw length of each particular method in an object file, you could use /usr/bin/otool to print out the assembly code, punctuated by Objective-C method names:

$ otool -tV MyClass.o

I look for long stretches of assembly that correspond to relatively short or simple methods and examine whether the code can be simplified or removed entirely.

In addition to otool, I've found that /usr/bin/size can be quite useful, since it breaks up segments and sections hierarchically and shows you the size of each, both for object files and compiled binaries. For example:
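
A minimal sketch of such a breakdown, assuming the macOS `size` tool's `-m` (Mach-O) flag and a linked binary whose output lines look like `Segment __TEXT: 16384` and `Section __text: 537`. The helper name and the app path are hypothetical; object files print their sections in a different shape, so this filter is meant for linked binaries only.

```shell
# Read `size -m` output on stdin and print "bytes<TAB>segment,section"
# for every section, largest first. sections_by_size is a hypothetical name.
sections_by_size() {
    awk '
        # Remember the segment that the following sections belong to.
        $1 == "Segment" { seg = $2; sub(/:$/, "", seg) }
        # Emit one line per section; "total" lines are ignored.
        $1 == "Section" { sec = $2; sub(/:$/, "", sec); print $3 "\t" seg "," sec }
    ' | sort -rn
}

# Usage (the app path is a placeholder):
#   size -m MyApp.app/Contents/MacOS/MyApp | sections_by_size
```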

This is a "bigger picture" view, and it usually confirms that the __TEXT,__text section is one of the largest in the file, and hence a good place to start pruning.

Dead Code Identification

Nobody really wants their binary to be littered with code that is never used. In a dynamic and loosely-coupled language like Objective-C, it can be difficult or impossible to statically determine whether specific code is "used" or not. Even if a class is instantiated or a method is called, tracing code paths (both theoretical and actual) can be a headache. I use a few tricks to help with this.

For static analysis, I strongly recommend the Clang Static Analyzer (which is happily built into Xcode 3.2 on Snow Leopard). Among all its other virtues, this tool can trace code paths and identify chunks of code that cannot possibly be executed. Such code should either be removed, or the surrounding code should be fixed so that it can be called.

For dynamic analysis, I use gcov (with unit testing) to identify which code is actually executed. Coverage reports (read with something like CoverStory) reveal un-executed code, which, coupled with manual examination and testing, can help identify code that may be dead. You do have to tweak some settings and run gcov manually on your binaries. I used this blog post to get started.

In practice, it's uncommon for dead code to be a large enough proportion of the code to make a substantial difference in binary size or load time, but dead code certainly complicates maintenance, and it's best to get rid of it if you can.

Symbol Visibility

Reducing symbol visibility may seem like a strange recommendation, but it makes things much easier for dyld (the linker that loads programs at runtime) and enables the compiler to perform better optimizations. Consider hiding global variables (that aren't declared as static) etc. by prefixing them with a "hidden" attribute, or enabling "Symbols Hidden by Default" in Xcode and explicitly making symbols visible. I use the following macros:
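
One common shape for such macros wraps the GCC/Clang `visibility` attribute. This is an assumption about what they might look like, not necessarily the ones meant above. The sketch writes a tiny C file that uses two hypothetical macros, `HIDDEN` and `VISIBLE`, builds it with `-fvisibility=hidden` (the command-line counterpart of "Symbols Hidden by Default"), and lists which of the two symbols the linked library still exports. All file and symbol names are made up for illustration.

```shell
# Build a throwaway shared library with hidden-by-default visibility and
# show which of the two demo symbols are still exported.
visibility_demo() {
    workdir=$(mktemp -d) || return 1
    cat > "$workdir/demo.c" <<'EOF'
#define HIDDEN  __attribute__((visibility("hidden")))
#define VISIBLE __attribute__((visibility("default")))

HIDDEN  int internal_counter = 0;                      /* not exported */
VISIBLE int public_api(void) { return ++internal_counter; }
EOF
    cc -fvisibility=hidden -fPIC -shared \
        -o "$workdir/libdemo.so" "$workdir/demo.c" || return 1
    # nm -g lists external symbols only; the hidden one should be absent.
    nm -g "$workdir/libdemo.so" | grep -E 'internal_counter|public_api'
    rm -rf "$workdir"
}

# Usage:
#   visibility_demo
```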

Although reducing symbol visibility is unlikely to directly reduce the size of your binary, the compiler may be able to make improvements it couldn't otherwise. Also, you stand to reduce accidental dependencies on symbols you didn't intend to expose.

Analyzing Library Dependencies and Loading

In addition to raw binary size, it can often be quite helpful to analyze which dynamic libraries you link to, and eliminate those that might be unnecessary, particularly less-commonly-used frameworks that may not be loaded yet. (You can see this from Xcode too, but with complex projects things sometimes slip through, so this also makes for a handy sanity check after building.) Again, otool to the rescue...

$ otool -L MyApp.app/Contents/MacOS/MyApp

Another (extremely verbose) alternative is to have dyld print loaded libraries, like so (from Terminal):
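
A minimal sketch, assuming the `DYLD_PRINT_LIBRARIES` variable described in `man dyld`. The wrapper name and the app path are hypothetical, and the variable is set only for the one command so it does not leak into the rest of the shell session.

```shell
# Run a command with dyld's library logging switched on for that command only.
run_with_dyld_libraries() {
    DYLD_PRINT_LIBRARIES=1 "$@"
}

# Usage (the app path is a placeholder):
#   run_with_dyld_libraries ./MyApp.app/Contents/MacOS/MyApp
```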

This shows exactly what is being loaded, including dependencies of the libraries your code links against.

Analyzing Launch Performance

Usually, what you really care about is whether the code size and library dependencies are truly affecting launch time. Setting this environment variable will cause dyld to report load statistics, which can really help pinpoint how time was spent on load:
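
A minimal sketch, assuming the variable in question is `DYLD_PRINT_STATISTICS` (listed in `man dyld`), that the dyld in use honors it, and that dyld writes its report to standard error. The helper name, log file, and app path are hypothetical.

```shell
# Run a command with dyld load statistics enabled and save whatever the
# process writes to stderr (assumed to include dyld's report) in a log file.
launch_stats() {
    log=$1
    shift
    DYLD_PRINT_STATISTICS=1 "$@" 2> "$log"
}

# Usage (paths are placeholders):
#   launch_stats /tmp/launch.log ./MyApp.app/Contents/MacOS/MyApp
#   cat /tmp/launch.log
```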

On Leopard and later, you'll notice entries about "dyld shared cache". Basically, the dynamic linker creates a consolidated "super library" composed of the most frequently-used dynamic libraries. It is mentioned in this Apple documentation, and the behavior can be altered with the DYLD_SHARED_REGION and DYLD_NO_FIX_PREBINDING environment variables, similar to above. See man dyld for details.

Thanks a lot. These were exactly the kinds of things I was looking for. I'm familiar with all of these tools, but didn't realize they had these additional options. I've dug into the symbol visibility issues before and it definitely makes a difference. Big danger there: it can screw up C++ exception handling if you throw in one .o and catch in another (and it's very hard to detect that this is happening, and you won't get any warnings that things won't work). But for pure-ObjC code it does seem promising. Thanks again.
– Rob Napier, Jun 18 '09 at 15:36

No problem at all, I'm happy to share. I think code profiling should be more commonplace, not a black art practiced only in secret by code sorcerers. ;-) I don't use C++, so I wasn't aware of the problems with exceptions there — thanks for sharing!
– Quinn Taylor, Jun 18 '09 at 17:04

You might want to look at otool. Specifically, you probably want to use the -l flag which displays all the load commands (a.k.a. the sections and segments) that make up your binary.

Having said all that, you would usually find that the resources are more significant than the code you write, so I'm wondering what problem you’ve encountered that you’re trying to solve. Our applications have a fair bit of code yet are still only a few MB. Maybe you’re statically linking to some big libraries—I don't know.

If most of your code is Objective-C, very little of it will be removed with dead-code stripping (for obvious reasons), so that won’t make much difference.

What will make a difference is the debug information, which will be substantial. Your object files will include this, but you'd typically have it stored in a separate dSYM bundle when you link, so it won't be included in the final binary (or at least this is what you should be doing).

Your code will be in the __TEXT, __text segment/section.

I'm pretty sure the linker will coalesce equivalent strings so the total will be less than the sum of the parts for these sections, but, I guess, typically not by much.

I would also expect your relocation and symbols sections to be less than the sum of the parts. You should strip your linked binary of unneeded symbols to save space (which isn't the same as stripping debug information). See the "Strip Linked Product" setting in Xcode.

One other thing to remember is that your linked binary will be a FAT binary, whereas the object files usually aren’t.

Yeah, I'm certain otool has the secrets in there somewhere... The goal of the exercise is the same as for performance tuning. You don't wait until the program is crawling to start asking "how could I improve it without sacrificing maintainability." Modern binaries can be surprisingly large, and I'm looking for what optimization tools are available for when they're needed (iPhone makes this more urgent, but we've gotten sloppy sometimes on Mac). Good reminder about the fat binary. Agreed about dead-code stripping, which is why I'm contemplating how to discover cruft that is never called.
– Rob Napier, Jun 7 '09 at 0:24