Inside the Bracket, part 7 - Runtime Machinations

Want to learn more about what’s really happening inside those square brackets? Read the entire Inside the Bracket series.

In our last episode, you saw a real application of Objective-C metadata introspection, looking at the information generated by the compiler and made available by the Objective-C runtime. Now it’s time for some fun - changing things!

You can manipulate classes at runtime. One way you can do this is by creating them entirely from your code, adding methods and instance variables. Once it’s built, the new class is a first-class citizen as far as the Objective-C runtime is concerned. That’s a pretty rare thing to do, so instead I’ll just concentrate on changing existing classes.

Remember @dynamic? It’s one of the directives that tells the compiler what to do with a property. If you tell the compiler “hey, this property is dynamic”, the compiler will do nothing else. It won’t make an instance variable and it won’t emit any accessor methods. Someone is responsible for hooking up those methods before they’re used. You can do that with the class_addMethod function.

Here’s our victim class - you can get the final version of the code from this gist:

@interface Glorn : NSObject
@end

@implementation Glorn
@end

Not a whole lot there, just the basic support machinery you get from NSObject. You know what happens when you send an object an unexpected message:

name = [glorn larvoid];

This wonderful message:

-[Glorn larvoid]: unrecognized selector sent to instance 0x1021142b0

Easy enough to fix! First, make a function that takes the usual first two Objective-C parameters, self and the selector:

This is our IMP, which is a pointer to an Objective-C implementation. IMPs are the unit of currency for referencing code. When you ask a class for the code behind a method, you’ll get an IMP. When you change the code behind a method, you give it an IMP.

Get the class and add the method, passing the IMP, the selector, and a type signature:
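Here is a minimal sketch of those two steps, assuming ARC and a Foundation command-line tool. The names LarvoidIMP, AddLarvoid, and the HypotheticalDeclarations category are made up for illustration, as is the string the method returns; this is not the code from the gist.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface Glorn : NSObject
@end

@implementation Glorn
@end

// Declaration only (no @implementation), so ARC knows the method's types.
// The category name is hypothetical.
@interface Glorn (HypotheticalDeclarations)
- (NSString *)larvoid;
@end

// The IMP: a plain C function whose first two parameters are self and _cmd.
static NSString *LarvoidIMP(id self, SEL _cmd) {
    return [NSString stringWithFormat:@"%@ handled %@",
            NSStringFromClass([self class]), NSStringFromSelector(_cmd)];
}

// Returns YES the first time; NO if Glorn already has a -larvoid of its own.
static BOOL AddLarvoid(void) {
    // "@@:" reads: returns an object, takes an object (self) and a selector.
    return class_addMethod([Glorn class], @selector(larvoid),
                           (IMP)LarvoidIMP, "@@:");
}

int main(void) {
    @autoreleasepool {
        AddLarvoid();
        Glorn *glorn = [[Glorn alloc] init];
        NSLog(@"%@", [glorn larvoid]);
    }
    return 0;
}
```

The type string is the part that is easiest to get wrong: it lists the return type first, then self and _cmd, then any further arguments.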

Pretty easy - you just need to make a function, figure out its type signature, and then class_addMethod it. There is an easier way, too.

Give your kid mental blocks for Christmas

imp_implementationWithBlock is not in the official documentation (which hasn’t been updated since 2010), but is in the <objc/runtime.h> header. Instead of making a function, you give it a block that will be passed a self pointer (but no selector), and then any subsequent arguments:

The __bridge cast is to keep ARC happy because the function takes a void* for its block, presumably because there’s no common block type you’ll be passing. You get back a full-fledged IMP. imp_implementationWithBlock wraps the block in a trampoline function and returns it to you. For more of the grungy details, check out Bill Bumgarner’s blog post.

Now you need to get the encoding of the block, which is at least “v@” - returning void, and taking an object argument (self) - and then add the type of NSUInteger. Because NSUInteger can be different sizes on different platforms, you can’t hardcode its encoding the way I’m doing for the other two. Instead, construct the string at runtime. Then call class_addMethod, and free the buffer:
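A sketch of how that could look, assuming ARC on a Mac. The helper name InstallGreeting, the greeting text, and the declaration category are my own inventions. Two further assumptions to check against your SDK: the headers I know of now declare the parameter of imp_implementationWithBlock as id, so this sketch passes the block without a __bridge cast; and the encoding here includes “:” for the selector slot, since the string handed to class_addMethod describes the method rather than the block.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <stdio.h>
#import <stdlib.h>

@interface Glorn : NSObject
@end

@implementation Glorn
@end

// Declaration only, so the call in main compiles under ARC (hypothetical name).
@interface Glorn (HypotheticalDeclarations)
- (void)greetingWithCount:(NSUInteger)count;
@end

// Adds -greetingWithCount: to Glorn, backed by a block that captures name.
static BOOL InstallGreeting(NSString *name) {
    // Build the encoding at run time: NSUInteger's encoding differs
    // between 32-bit and 64-bit platforms.
    char *types = NULL;
    if (asprintf(&types, "v@:%s", @encode(NSUInteger)) < 0) {
        return NO;
    }

    // The block receives self first (no _cmd), then the method's arguments.
    IMP imp = imp_implementationWithBlock(^(id receiver, NSUInteger count) {
        for (NSUInteger i = 0; i < count; i++) {
            NSLog(@"%@ says hello to %@", receiver, name);
        }
    });

    BOOL added = class_addMethod([Glorn class], @selector(greetingWithCount:),
                                 imp, types);
    free(types);
    if (!added) {
        imp_removeBlock(imp);  // method already existed; release the trampoline
    }
    return added;
}

int main(void) {
    @autoreleasepool {
        InstallGreeting(@"Bork");
        Glorn *glorn = [[Glorn alloc] init];
        [glorn greetingWithCount:5];
    }
    return 0;
}
```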

I’m using asprintf to try to break myself of the habit of “oh, I’ll just make a 512 byte character buffer on the stack and snprintf into that”, which can bite you later if that chunk of code gets used for something real.

Now you can call any arbitrary Glorn to get a repeated greeting:

[glorn greetingWithCount: 5];

And get the expected output. You’ll notice it captured name from the surrounding scope.

Resolve to be excellent to each other

Remember back in part 4 where I talked about some methods that get called when the Objective-C runtime can’t find a method for a given selector, and then used them to forward on to other objects? There’s another mechanism you can use in this circumstance. +resolveInstanceMethod: gives you the opportunity to add a method to a class before all hell breaks loose. It actually happens when someone asks “does this class respond to this selector.” You can look around, make decisions, add methods if need be to say “why yes. Yes it does!”

Try sending the Glorn some more messages:

[glorn keep];
[glorn watching];
[glorn theSkies];

This will croak at runtime unless you do something. For this class, make all unknown selectors that take no arguments create a new method, and have that new method print out the name of the selector. Otherwise, bail out and tell the runtime that you couldn’t resolve the instance method.
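One way that could be written, as a sketch under ARC. The colon check is my shorthand for “takes no arguments”, and the declaration category name is hypothetical. It is as indiscriminate as the description above: every unknown no-argument selector gets a method, including ones the frameworks happen to probe for.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface Glorn : NSObject
@end

@implementation Glorn

+ (BOOL)resolveInstanceMethod:(SEL)selector {
    NSString *name = NSStringFromSelector(selector);

    // A colon in the name means the method takes arguments: not ours.
    if ([name rangeOfString:@":"].location != NSNotFound) {
        return [super resolveInstanceMethod:selector];
    }

    // The block captures selector, so each new method knows its own name.
    IMP imp = imp_implementationWithBlock(^(id receiver) {
        NSLog(@"you just called %@", NSStringFromSelector(selector));
    });

    // In a class method, self is the class itself.
    class_addMethod(self, selector, imp, "v@:");
    return YES;
}

@end

// Declarations only, so the calls below compile under ARC (hypothetical name).
@interface Glorn (HypotheticalDeclarations)
- (void)keep;
- (void)watching;
- (void)theSkies;
@end

int main(void) {
    @autoreleasepool {
        Glorn *glorn = [[Glorn alloc] init];
        [glorn keep];
        [glorn watching];
        [glorn theSkies];
    }
    return 0;
}
```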

Notice that the block has captured the value of the selector that’s passed in. No need to copy the selector’s name to an NSString or a char buffer. That’s what makes imp_implementationWithBlock really powerful - you can take advantage of all of a block’s various behaviors. Now, when you call these methods:

[glorn keep];
[glorn watching];
[glorn theSkies];

It prints out

you just called keep
you just called watching
you just called theSkies
you just called _doZombieMe

Whoa! What’s that last one? The first time I got my +resolveInstanceMethod: working, I saw that zombie-me method and laughed. I was totally not expecting it. It comes from NSObject’s dealloc, presumably for Zombie debugging support. Here’s the stack when it happens:

With ARC you can no longer casually send random messages to objects that the compiler hasn’t heard of, such as -larvoid. The compiler has to know the types involved so it can do proper memory management. This is just a warning in non-ARC land, so if you want to casually send messages, you can compile a file with -fno-objc-arc.

For the code here, I kind of cheated and put a category on NSObject to say “ok, here’s what these methods really look like”. It’s similar to something I used without explanation in Part 4, which was only there to get the code to compile. Here’s the category for completeness:
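The original listing isn’t reproduced here, so this is a guess at its shape: a category interface with no implementation. The category name and the return types are assumptions. The small main shows that the declarations only satisfy the compiler and add nothing at runtime.

```objc
#import <Foundation/Foundation.h>

// Declarations only: there is no @implementation, so nothing is added
// to NSObject. The category name is hypothetical.
@interface NSObject (HypotheticalGlornMethods)
- (NSString *)larvoid;
- (void)greetingWithCount:(NSUInteger)count;
- (void)keep;
- (void)watching;
- (void)theSkies;
@end

int main(void) {
    @autoreleasepool {
        NSObject *object = [[NSObject alloc] init];

        // This compiles under ARC thanks to the category, but the method
        // does not exist until something adds it at run time.
        if ([object respondsToSelector:@selector(larvoid)]) {
            NSLog(@"%@", [object larvoid]);
        } else {
            NSLog(@"declared, but nobody has added -larvoid yet");
        }
    }
    return 0;
}
```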

The Grand Finale

Not only can you add methods at runtime, you can change the code that runs at runtime. You can patch your own stuff into existing classes. I know this is what everyone is here for. “Duuuuude, show me the swizzle!”

Like my previous bit about DTrace, I originally wanted to write about method swizzling because I got to use it to great effect in Krëndler, but you kind of had to know all this other stuff to appreciate what’s going on when you swizzle methods.

A Bit of History

Back in the old, old days, when memory was measured in K and processor clock speeds were measured in units of 0.001 gigahertz, the original Mac used a pretty clever scheme for calling functions inside of the Mac Toolbox. The Toolbox was a huge (64K) library of code that made every Mac a Mac. Needed to allocate memory? Wanted to create a window and draw into it? You called into the Toolbox.

Rather than using jump instructions (at the cost of four to six bytes) to call a function like GetNextEvent(), the original Mac used “A traps”, two-byte instructions that started with a leading 1010 nybble, which is 0xA. When the processor hit one of these instructions, the machine would stop, look in a table in memory for the address to jump to, and then resume execution there, kind of like an interrupt vector. You called the toolbox a lot, and saving two or four bytes on each call is a major win just in program compactness. But it also came with added flexibility.

Here’s the normal use case: GetNextEvent is trap A970 (the nostalgic can find a complete list in part 3B of this page). The processor sees A970 in the instruction stream, stops, indexes into the lookup table, gets the address in ROM to jump to, and then jumps.

This table lives in RAM. There was no memory protection on the original Mac, so we could write anything, anywhere. (Let me tell you that was fun, for limited values of “fun”). What would happen if we saved that address from the table, stashed it elsewhere in memory, and put a pointer to our own code in the table? You get this:

Now when GetNextEvent’s A-Trap is encountered, the system stops, looks in the table, finds the address of WhackEvents, and calls it. Poof! We’ve just added a global event filter. It could examine the event returned from the Toolbox and decide to return it untouched. Or maybe it would modify the event. Or perhaps just drop the event on the floor entirely and wait for another one. This is what was known as “trap patching”. Trap patching was also used by Apple to fix bugs in the ROM - the system loads some code into RAM and twiddles with the trap table to point to the fix.

Trap patching was a rite of passage for all Mac programmers to do at least once. My first experimental patch was FrameRect, A8A1, which I figured was called enough to actually be hit. I made it SysBeep to say “I’m here!” I quickly realized just how often FrameRect gets called, even in a do-nothing program. Man that was loud! I had to power off the machine to get control back.

This is what method swizzling does, but for Objective-C. You can stick your code inside someone else’s class and alter the course of program flow.

Trap-patching is a very powerful technique, as is method swizzling. But it’s also very fragile. An OS update might render your patch obsolete. It may change the way something behaves so that all the assumptions you made about the environment of the call (state of registers, variables on the stack, other functions on the stack, and so on) might be wrong. Your now-broken code might be nice enough to instantly lock up the machine, or it might be mean enough to scribble a couple of bytes into the user’s data, corrupting it for a later point in time. Trap-patching was fraught with peril, and if you didn’t have to do it to work around a fatal bug, it was best if you never even heard of the concept.

Swizzling

Before getting into Swizzling, I want to make that same admonition. Don’t use it for real code in real programs used by real people. The risk of stuff breaking is rarely worth it. So far in my career I’ve used swizzling for writing some unit tests (say, wanting to patch out part of some godforsaken singleton), but have never shipped any.

Now, for programming competitions where you’re showing off to your friends, or if your product is fundamentally an evil hack and your users expect it to break under minor OS updates, it’s fine. It can also be a useful exploration and debugging tool.

You are finite. Zathras is finite. This is wrong tool.

The canonical method swizzling function is kind of subtle. You’ll first see the simple (and wrong) way to do it, see why it’s wrong, and then see the right way.

To start off with, here are some sample classes. The complete code can be found at this gist. The base class has two utility methods, and a cover method that invokes those two:

This is polymorphism 101. doStuff is in the BaseClass, but it’s causing code to be run in FirstBegat because the Objective-C runtime machinery is poking around FirstBegat’s class looking in its code pile for things stored under the name “hornswoggle” and “bamboozle”. (If this didn’t make any sense, it probably means you skipped the preliminaries.)

To make life interesting, here’s a second class. It subclasses FirstBegat, but only overrides hornswoggle. It gets the default bamboozle.
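The class listings are missing here, so this is a sketch reconstructed from the descriptions. The class and method names come from the text; the log messages are made up.

```objc
#import <Foundation/Foundation.h>

@interface BaseClass : NSObject
- (void)hornswoggle;
- (void)bamboozle;
- (void)doStuff;
@end

@implementation BaseClass
- (void)hornswoggle { NSLog(@"BaseClass hornswoggle"); }
- (void)bamboozle   { NSLog(@"BaseClass bamboozle"); }

// The cover method. Both sends are dispatched dynamically, so a
// subclass's overrides are what actually run.
- (void)doStuff {
    [self hornswoggle];
    [self bamboozle];
}
@end

// Overrides both utility methods.
@interface FirstBegat : BaseClass
@end

@implementation FirstBegat
- (void)hornswoggle { NSLog(@"FirstBegat hornswoggle"); }
- (void)bamboozle   { NSLog(@"FirstBegat bamboozle"); }
@end

// Overrides hornswoggle only; bamboozle is inherited from FirstBegat.
@interface SecondBegat : FirstBegat
@end

@implementation SecondBegat
- (void)hornswoggle { NSLog(@"SecondBegat hornswoggle"); }
@end

int main(void) {
    @autoreleasepool {
        [[[FirstBegat alloc] init] doStuff];
        [[[SecondBegat alloc] init] doStuff];
    }
    return 0;
}
```

SecondBegat and FirstBegat share one bamboozle implementation, which is the detail the swizzling discussion hinges on.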

The Wrong Way

One of the fundamental utilities for swizzling methods is method_exchangeImplementations. You first get two Method objects from the class by using class_getInstanceMethod and then pass them to method_exchangeImplementations. Their implementations, their IMPs, will be swapped. Exchanging hornswoggle and bamboozle for FirstBegat would cause [first hornswoggle] to print out “bamboozle”, and vice versa. Here’s the first crack at a Swizzling function:
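A sketch of what such a function might look like. The name SwizzleMethodBadly comes from the text, but the parameter names and body are my reconstruction. The cut-down FirstBegat returns strings rather than printing, purely so the swap is easy to see.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// The simple (and flawed) version: look both methods up, swap their IMPs.
static void SwizzleMethodBadly(Class clas, SEL originalSelector, SEL newSelector) {
    Method originalMethod = class_getInstanceMethod(clas, originalSelector);
    Method newMethod = class_getInstanceMethod(clas, newSelector);

    // Only swap if both lookups found something.
    if (originalMethod != NULL && newMethod != NULL) {
        method_exchangeImplementations(originalMethod, newMethod);
    }
}

// Cut-down stand-in for the sample class (assumption: returns strings).
@interface FirstBegat : NSObject
- (NSString *)hornswoggle;
- (NSString *)bamboozle;
@end

@implementation FirstBegat
- (NSString *)hornswoggle { return @"hornswoggle"; }
- (NSString *)bamboozle   { return @"bamboozle"; }
@end

int main(void) {
    @autoreleasepool {
        FirstBegat *first = [[FirstBegat alloc] init];
        NSLog(@"before: %@", [first hornswoggle]);

        SwizzleMethodBadly([FirstBegat class],
                           @selector(hornswoggle), @selector(bamboozle));

        // The selector is unchanged, but the code behind it has been swapped.
        NSLog(@"after: %@", [first hornswoggle]);
    }
    return 0;
}
```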

You call this with a class and two selectors. The methods are looked up, then swapped, assuming they both exist. Their signatures should be the same: they should take the same kinds of arguments and return the same type. Nothing checks that the signatures match, so if you exchange @selector(description) with @selector(initWithBitmapDataPlanes:pixelsWide:pixelsHigh:bitsPerSample:samplesPerPixel:hasAlpha:isPlanar:colorSpaceName:bitmapFormat:bytesPerRow:bitsPerPixel:), you get what you deserve.

So, let’s hack FirstBegat. Remember that it implements both hornswoggle and bamboozle. (This is a very important detail.) The hack looks like this:

Even though we’re swizzling the methods badly, this is the coding technique I use even when swizzling them correctly. This adds a new method, does some work, and then calls the original method. Enjoy this. Revel in the horror. What’s going on?

First, the machinery. SwizzleMethodBadly, and soon enough SwizzleMethod, exchanges two methods in a single class. If you want to whack FirstBegat’s class, you need to have a method that’s inside of FirstBegat’s code pile. A category is a great way to do that. Declare the hck_hijackMethods method (and remember to be safe and always prefix your category methods):

All it does is swap the selector bamboozle with something with a (more) bizarre name. What’s up with that name? The dollar sign is a legal identifier character in C, along with the usual alphanumerics and the underscore. Not many people know about the dollar sign, and those that do typically use it to indicate that something weird or hacky is coming up. In this case, the method name $hackFirstBegat_Bamboozle tells me a) “$”: something gross is coming up and it’s most likely a swizzle, b) the class that’s being attacked, and c) some indication of the method being changed. That way it’s pretty obvious that this method is used in a swizzling situation and what it does. You could just as easily have called it fluffyBunny and things would work the same.

Now for the method that gets swizzled. This is what will be invoked when someone sends the -bamboozle message. The first two lines are straightforward - the method signature and the work being done. In this case, printing a message:

The next line calls the original implementation. This code isn’t replacing the original code, just augmenting it.

[self $hackFirstBegat_Bamboozle];
}

But wait, you say. Isn’t this a recursive call? We’re inside of $hackFirstBegat_Bamboozle, and now it’s calling itself again!

Remember that the methods have been exchanged. The class’s selector-method dictionary now looks like this:

bamboozle -> the new hack code
$hackFirstBegat_Bamboozle -> the original bamboozle code

Remember how things work: [self $hackFirstBegat_Bamboozle] tells the Objective-C runtime to access self’s class, dig into the code map, look up $hackFirstBegat_Bamboozle, find the code there, and jump to it. Thanks to method_exchangeImplementations, this selector is currently pointing to the old code, and so we’re calling the original code here, even though it’s been filed under our new name. Yeah, it looks pretty weird, but that’s how it works.

And speaking of “works”, it seems to work. After swizzling the methods, and telling first to doStuff, you can see the new code being run:
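Putting the pieces together, here is a sketch of the whole hack under ARC. hck_hijackMethods, $hackFirstBegat_Bamboozle, and SwizzleMethodBadly are named in the text; the category name HackNSlash, the log messages, and the function body are my assumptions. It also relies on the compiler accepting “$” in identifiers, which clang does by default as far as I know.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface BaseClass : NSObject
- (void)hornswoggle;
- (void)bamboozle;
- (void)doStuff;
@end

@implementation BaseClass
- (void)hornswoggle { NSLog(@"BaseClass hornswoggle"); }
- (void)bamboozle   { NSLog(@"BaseClass bamboozle"); }
- (void)doStuff     { [self hornswoggle]; [self bamboozle]; }
@end

@interface FirstBegat : BaseClass
@end

@implementation FirstBegat
- (void)hornswoggle { NSLog(@"FirstBegat hornswoggle"); }
- (void)bamboozle   { NSLog(@"FirstBegat bamboozle"); }
@end

// The simple swap; safe here only because FirstBegat implements both methods.
static void SwizzleMethodBadly(Class clas, SEL originalSelector, SEL newSelector) {
    Method originalMethod = class_getInstanceMethod(clas, originalSelector);
    Method newMethod = class_getInstanceMethod(clas, newSelector);
    if (originalMethod != NULL && newMethod != NULL) {
        method_exchangeImplementations(originalMethod, newMethod);
    }
}

// The category puts the replacement code into FirstBegat's own code pile.
@interface FirstBegat (HackNSlash)
+ (void)hck_hijackMethods;
- (void)$hackFirstBegat_Bamboozle;
@end

@implementation FirstBegat (HackNSlash)

+ (void)hck_hijackMethods {
    SwizzleMethodBadly(self, @selector(bamboozle),
                       @selector($hackFirstBegat_Bamboozle));
}

- (void)$hackFirstBegat_Bamboozle {
    NSLog(@"hacked: about to bamboozle");
    // Not recursion: after the exchange, this selector is filed
    // against the original bamboozle code.
    [self $hackFirstBegat_Bamboozle];
}

@end

int main(void) {
    @autoreleasepool {
        FirstBegat *first = [[FirstBegat alloc] init];
        [first doStuff];                 // original behavior
        [FirstBegat hck_hijackMethods];
        [first doStuff];                 // bamboozle now runs the hack first
    }
    return 0;
}
```

Call hck_hijackMethods exactly once. A second call swaps the implementations back.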

Lurking Software Defects

There’s just one problem. class_getInstanceMethod does its job too well. If this function can’t find the method in the given class, it’ll go up the inheritance chain and find the first implementation it can. If you call method_exchangeImplementations with that Method, you’ve now put your code into an unexpected class.

Imagine swizzling a method on NSButton, intending to affect only buttons, but whacking NSView instead. Now every single view has your swizzle code, potentially leading to hilarious results. This is something that could bite you with OS updates. Say Apple refactored a class you’ve swizzled and removed the need for that class to override a method. Your working swizzle, which was using this buggy swizzle implementation, is now broken.

Want proof? Remember that SecondBegat overrides hornswoggle, but not bamboozle. If we SwizzleMethodBadly SecondBegat’s bamboozle, it will whack FirstBegat’s version. Here’s the swizzling:
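A sketch that demonstrates the leak by comparing FirstBegat’s bamboozle IMP before and after the swap. The selector name $hackSecondBegat_Bamboozle is my assumption, built by analogy with the earlier hack. BaseClass is left out for brevity, and the helper BadSwizzleLeaksIntoFirstBegat is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface FirstBegat : NSObject   // BaseClass omitted for brevity
- (void)hornswoggle;
- (void)bamboozle;
@end

@implementation FirstBegat
- (void)hornswoggle { NSLog(@"FirstBegat hornswoggle"); }
- (void)bamboozle   { NSLog(@"FirstBegat bamboozle"); }
@end

@interface SecondBegat : FirstBegat
@end

@implementation SecondBegat
- (void)hornswoggle { NSLog(@"SecondBegat hornswoggle"); }
@end

@interface SecondBegat (HackNSlash)
- (void)$hackSecondBegat_Bamboozle;
@end

@implementation SecondBegat (HackNSlash)
- (void)$hackSecondBegat_Bamboozle {
    NSLog(@"hacked: about to bamboozle");
    [self $hackSecondBegat_Bamboozle];
}
@end

static void SwizzleMethodBadly(Class clas, SEL originalSelector, SEL newSelector) {
    // For bamboozle this lookup walks up and returns FirstBegat's Method.
    Method originalMethod = class_getInstanceMethod(clas, originalSelector);
    Method newMethod = class_getInstanceMethod(clas, newSelector);
    if (originalMethod != NULL && newMethod != NULL) {
        method_exchangeImplementations(originalMethod, newMethod);
    }
}

// Returns YES if swizzling SecondBegat changed FirstBegat's bamboozle.
static BOOL BadSwizzleLeaksIntoFirstBegat(void) {
    IMP before = [FirstBegat instanceMethodForSelector:@selector(bamboozle)];
    SwizzleMethodBadly([SecondBegat class], @selector(bamboozle),
                       @selector($hackSecondBegat_Bamboozle));
    IMP after = [FirstBegat instanceMethodForSelector:@selector(bamboozle)];
    return before != after;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"FirstBegat got whacked: %@",
              BadSwizzleLeaksIntoFirstBegat() ? @"YES" : @"NO");
    }
    return 0;
}
```

The sketch deliberately never sends bamboozle to a plain FirstBegat after the swap. The hack code would then send $hackSecondBegat_Bamboozle to an object whose class doesn’t have it, which ends in an unrecognized selector error.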

newMethod will be code that fer-sher lives in SecondBegat because we’re passing in the selector we’re swapping with ($hackSecond). originalMethod could be a method that lives in SecondBegat. It might live in FirstBegat, or even BaseClass or NSObject.

How to tell? class_addMethod fails if there’s already a method for that selector in its code pile. So, try adding the $hackSecond method to SecondBegat’s class under the bamboozle selector:

If this succeeds, it means that SecondBegat did not have a bamboozle of its own, but it does now (pointing to $hackSecond). If it fails, nothing happens, and we know that SecondBegat did, in fact, have its own bamboozle.

So if we added the method, that means that now SecondBegat has a new -bamboozle that points to the $hackSecond method. To complete the swizzle, $hackSecond needs to point to the original code. class_replaceMethod says “hey class! Whatever you have this selector pointing to in your code pile, replace it with this method”. In this case, replacing $hackSecond with the original -bamboozle. Done. This is the code path that this particular swizzle will take.

Now for the other case, where the method add failed because the class already had a -bamboozle method. Just exchange the two:
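Here is how the add-then-replace-or-exchange logic could fit together in one function. It is a sketch of the approach described above, not the gist’s code. The selector $hackSecondBegat_Bamboozle and the helper GoodSwizzleLeavesFirstBegatAlone are hypothetical names, and BaseClass is omitted for brevity.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface FirstBegat : NSObject   // BaseClass omitted for brevity
- (void)bamboozle;
@end

@implementation FirstBegat
- (void)bamboozle { NSLog(@"FirstBegat bamboozle"); }
@end

@interface SecondBegat : FirstBegat
- (void)$hackSecondBegat_Bamboozle;
@end

@implementation SecondBegat
- (void)$hackSecondBegat_Bamboozle {
    NSLog(@"hacked: about to bamboozle");
    [self $hackSecondBegat_Bamboozle];
}
@end

static void SwizzleMethod(Class clas, SEL originalSelector, SEL newSelector) {
    Method originalMethod = class_getInstanceMethod(clas, originalSelector);
    Method newMethod = class_getInstanceMethod(clas, newSelector);
    if (originalMethod == NULL || newMethod == NULL) return;

    // Succeeds only if clas has no method of its own for originalSelector.
    BOOL added = class_addMethod(clas, originalSelector,
                                 method_getImplementation(newMethod),
                                 method_getTypeEncoding(newMethod));
    if (added) {
        // The original was inherited. File its code under newSelector in
        // clas, leaving the superclass untouched.
        class_replaceMethod(clas, newSelector,
                            method_getImplementation(originalMethod),
                            method_getTypeEncoding(originalMethod));
    } else {
        // clas already had its own method, so a plain exchange is safe.
        method_exchangeImplementations(originalMethod, newMethod);
    }
}

// Returns YES if FirstBegat's bamboozle is the same code before and after.
static BOOL GoodSwizzleLeavesFirstBegatAlone(void) {
    IMP before = [FirstBegat instanceMethodForSelector:@selector(bamboozle)];
    SwizzleMethod([SecondBegat class], @selector(bamboozle),
                  @selector($hackSecondBegat_Bamboozle));
    IMP after = [FirstBegat instanceMethodForSelector:@selector(bamboozle)];
    return before == after;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"FirstBegat untouched: %@",
              GoodSwizzleLeavesFirstBegatAlone() ? @"YES" : @"NO");
        [[[SecondBegat alloc] init] bamboozle];  // hack, then original
        [[[FirstBegat alloc] init] bamboozle];   // original only
    }
    return 0;
}
```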

With great power use comes great electric bills

I hope I’ve hammered the point home that this is a powerful, potentially dangerous tool. A Sawzall is a powerful tool, but it could be dangerous if you accidentally cut a conduit apart in your home. But there are some legitimate uses out here in application land. There might be a really bad bug in a new version of the toolkit that has no other workaround. You might want to change the behavior of a helper class during a unit test. You might be tracking down a bug and be wondering “what data is really flowing through this Cocoa method” - swizzle in a spy to print out what it’s seeing, maybe modify it, and then send control on to the existing code.

Thanks for all the fish

This wraps up my tour of the low-level goodies that exist in the Objective-C runtime. I didn’t cover everything - there’s just not enough time. Also, a reason a lot of this exists is to bridge to other languages, which is a pretty esoteric topic even for me. There are practical uses for some of this stuff, and I’m a believer that knowledge is power, no matter how skanky the hacks are to obtain that knowledge. Just be professional in what you ship to paying customers, whether they’re paying in money or their time.

_18 months ago, MarkD started writing for the Big Nerd Ranch blog. 70-some-odd articles (and some are pretty odd) and 110,000 words later, we’re sending him away for a well-deserved blogging vacation. Don’t worry, he’ll be back in the fall._