Hijacking for Fun and Profit

Last fall I took a week off to escape from the world. I assigned myself three tasks for that time: disappear and recover from a number of stressful deadlines, clean up my office area (affectionately known as “my cage”), and learn about CocoaTouch gesture recognizers. I ended up writing a little tool for playing around with gesture recognizers that I called GestureLab.

Being an unrepentant fan of caveman debugging, one of the things I wanted for GestureLab was real-time logging of messages printed from code. If I’m debugging a gesture recognizer, I’d like to be able to add some log statements and see them appear on the iPad at the same time I’m tapping and dragging.

That turned out to be easier than I thought. It’s not entirely straightforward because you need to use calls from the Unix layer, Core Foundation, and Cocoa / CocoaTouch to make it all work.

The Problem

Things like printf() and NSLog() write bytes to two well-known file streams: standard out and standard error. The standard C library exposes these as the global variables stdout and stderr, each a pointer to a FILE structure. Inside of those FILEs are the file descriptors, the fundamental Unix I/O mechanism, commonly abbreviated as “fd”. You read() and write() through file descriptors.

A file descriptor is just an integer. fd 0 is the raw standard-in file descriptor, fd 1 the raw standard-out file descriptor, and fd 2 the raw standard-error file descriptor. Ultimately, every printf() and NSLog() sends bytes to fd 1 or fd 2, which then get picked up by Xcode when you’re debugging, or sent to the console or terminal.
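Here’s a small sketch of that relationship in plain C. The helper names descriptor_for and write_raw are made up for illustration; fileno() and write() are the standard POSIX calls.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Returns the file descriptor wrapped by a standard C stream.
// fileno() is the POSIX accessor for the fd inside a FILE.
int descriptor_for(FILE *stream) {
    assert(stream != NULL);
    return fileno(stream);
}

// Writes a C string straight to a raw file descriptor, bypassing stdio
// buffering. Returns the number of bytes written, or -1 on error.
ssize_t write_raw(int fd, const char *text) {
    assert(text != NULL);
    return write(fd, text, strlen(text));
}
```

In an ordinary process descriptor_for(stdout) is 1 and descriptor_for(stderr) is 2, so write_raw(1, "hello\n") ends up in the same place as a flushed printf().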

Wouldn’t it be nice to hijack those and send strings into your own code to do with as you want? You would need to do something like this:

dup() all the things

Redirecting these byte streams is going to require playing games with the file descriptors. Specifically, replacing what’s currently living at file descriptor 1 with some other file descriptor. Luckily the Unix layer has two calls specifically geared for file descriptor sleight of hand: dup() and dup2().

dup() takes a file descriptor and duplicates it, returning a new file descriptor (a different number) that points to the same file. It actually points to the same kernel data structures, so things like the read/write position and the file status flags are shared. (The close-on-exec flag is the exception: it belongs to the descriptor itself, and dup() leaves it cleared on the new one.) Writing to the file descriptor you passed to dup() behaves exactly the same as writing to the return value from dup().

Imagine this is the table of open files:

File descriptors 0, 1, and 2 point to the standard I/O streams, fd 4 points to an open (and presumably playing) QuickTime movie, and fd 5 is pointing to a log file. There’s nothing on fd 3, presumably a file that was opened and subsequently closed.

If you dup(1), which is standard out’s file descriptor, dup() will return a new file descriptor, 3 in this case. This number is guaranteed to be the lowest available fd slot, helping keep the set of file descriptors as compact as possible.

Now fd 3 is pointing to standard out too.

Why is this useful? You’re going to use this to make a backup of the file descriptor that’s going to get hijacked, so you can remove the hijacking later if you want. You can also write bytes to the dup’d file descriptor, to mirror the output to several places.

dup2 Brute?

dup2() is the other tool. dup2() takes two file descriptors, which I like to call the file and the victim. A call like dup2(file, victim) will first close the victim. Then file is dup’d, with the new file descriptor being the same as victim’s.

Here’s a starting state with some open files, and you want bytes written to standard error to go to error-log.txt instead:

To perform that redirection, you would do something like:

dup2 (3, 2);

This takes the file (fd 3, the log file), and the victim (fd 2, the current standard error stream), closes victim, and points its file descriptor to the log file too.

And poof! Standard error has been redirected.

Why is this useful? If you had some mechanism where you could take bytes written to a file descriptor, and read them on some other file descriptor, you could dup2() it into the position you want to hijack. Then, any bytes written by the toolkit to fd 2 (in this case) could be read by your code and put into the textview.
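The redirection above, plus the dup() backup from the previous section, might look something like this in C. Treat it as a sketch: redirect_to_file and restore_descriptor are names I made up here, and real code would want better error reporting.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

// Points `victim` at the file at `path`. Returns a dup()'d backup of the
// original victim (for restoring later), or -1 on failure.
int redirect_to_file(int victim, const char *path) {
    assert(path != NULL);
    int backup = dup(victim);
    if (backup == -1) return -1;

    int file = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (file == -1) { close(backup); return -1; }

    int result = dup2(file, victim);   // closes victim, then re-points it
    close(file);                       // victim now holds its own reference
    if (result == -1) { close(backup); return -1; }
    return backup;
}

// Undoes redirect_to_file(): puts the backup back on `victim`.
// Returns 0 on success, -1 on failure.
int restore_descriptor(int victim, int backup) {
    int result = dup2(backup, victim);
    close(backup);
    return result == -1 ? -1 : 0;
}
```

Calling redirect_to_file(2, "error-log.txt") sends everything written to standard error into the log file until restore_descriptor(2, backup) puts the original stream back.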

Put that in your pipe() and read it

So, is there some kind of mechanism where you could take bytes written to one file descriptor and read them from another? You bet there is! It’s a pipe, the fundamental IPC mechanism used when you chain commands together in the shell. If you ran a command like

$ ls *.txt | wc -l

Your shell will launch the ls program and the wc program and connect them with a pipe. Bytes written out by ls will be read by wc. The pipe() system call gives you one of these pipes. There’s no reason you can’t use a pipe entirely within your own program.

You supply pipe() with an array of two ints. pipe() fills in that array with two file descriptors. Any bytes written to the file descriptor at array index 1 can be read from the file descriptor at array index 0. It’s unidirectional, so data only flows in that one direction.

So, the call:

int fds[2];
int result = pipe (fds);

Will fill in the fds array like so:

Anything written to the file descriptor at array index 1 (fd 25) can be read at the file descriptor (24) at array index 0.
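A quick way to convince yourself is to push a string through a pipe inside a single process. This is a minimal sketch with a made-up function name, and it only works for short messages, since nothing reads from the pipe while the write is happening.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

enum { kReadSide, kWriteSide };   // index 0 reads, index 1 writes

// Sends `text` through a fresh pipe and reads it back into `out`.
// Returns the number of bytes read, or -1 on failure.
ssize_t pipe_round_trip(const char *text, char *out, size_t out_size) {
    assert(text != NULL && out != NULL && out_size > 0);
    int fds[2];
    if (pipe(fds) == -1) return -1;

    size_t length = strlen(text);
    ssize_t count = -1;
    if (write(fds[kWriteSide], text, length) == (ssize_t)length) {
        close(fds[kWriteSide]);   // the reader sees EOF after the data
        fds[kWriteSide] = -1;
        count = read(fds[kReadSide], out, out_size - 1);
        if (count >= 0) out[count] = '\0';
    }
    if (fds[kWriteSide] != -1) close(fds[kWriteSide]);
    close(fds[kReadSide]);
    return count;
}
```

Passing "hello" returns 5 and leaves "hello" in the output buffer: the bytes went in at index 1 and came out at index 0.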

Why is this useful? You can dup2() the write-side of the pipe to the fd you want to hijack. Now any bytes that are written to the hijacked file descriptor will actually be going into this pipe. Your code can read from the read-fd of the pipe and get the log statements.

At this point, with pipe() and dup2(), the file descriptor can be successfully hijacked.

Run/L-O-O-P

There’s just one more missing piece, and then fundamentally you’re done. That’s getting the bytes from Unix-file-descriptor land into Cloud-Cocoa-Land. You can do that by adding the read-side of the pipe to the run loop. The write-side of the pipe is dup2()’d to the fd we’re hijacking, and the read side goes into the run loop so the application can drain the bytes from it.

I go into much more detail with integrating Unix file descriptors into the Cocoa / Core Foundation run loop model in AMOSXP(3), so check that out if you want the details, or take a look in the Hijack project code. But the short-form is: you wrap a native Unix file descriptor with a CFSocketRef and supply a callback function. You wrap the CFSocketRef in a CFRunLoopSourceRef and plug it into the runloop.

When there are any bytes available to read on the file descriptor, Core Foundation will read those bytes for you, packaging them up into a CFDataRef (which is toll-free bridged with NSData), and calling the callback function. That function can then take the data, turn it into an NSString, and append it to the text view. If you want to replicate the logging out to its original destination (say to Xcode), you would write those bytes to your saved-off, dup()’d file descriptor.

Why is this useful? It’s the glue that connects the pipe to the UI classes.
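The Core Foundation plumbing is in the project code, but the work the callback does can be sketched in portable C. This is a stand-in, not the run loop version: it calls read() itself instead of receiving a CFDataRef, appends to a C string instead of a text view, and the name drain_once is made up. It also blocks if nothing is waiting, which is exactly what the run loop source saves you from.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

// Reads what is currently available on `read_fd` (the pipe's read side),
// appends it to the C string in `log`, and, when `replica_fd` is not -1,
// writes the same bytes there too. Returns the number of bytes handled,
// 0 at end-of-file, or -1 on error.
ssize_t drain_once(int read_fd, int replica_fd, char *log, size_t log_size) {
    assert(log != NULL && log_size > 0);
    char buffer[512];
    ssize_t count = read(read_fd, buffer, sizeof buffer);
    if (count <= 0) return count;

    // Append as much as fits, always leaving room for the terminator.
    size_t used = strlen(log);
    size_t room = log_size - 1 - used;
    size_t keep = (size_t)count < room ? (size_t)count : room;
    memcpy(log + used, buffer, keep);
    log[used + keep] = '\0';

    // Mirror the bytes to the saved-off descriptor, if there is one.
    if (replica_fd != -1 && write(replica_fd, buffer, (size_t)count) != count) {
        return -1;
    }
    return count;
}
```

Passing the dup()’d backup of standard out as replica_fd gives you the replication behavior: the text shows up in your own log and in Xcode’s console.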

Putting it all together

dup() standard out’s file descriptor so it doesn’t get closed. Pretend the next available file descriptor is 23.

existingStdOut = dup (stdout_fd);

Now make a pipe. Anything written to fd 25 can be read from fd 24:

int fds[2];
pipe (fds);

Now hijack the standard out file descriptor:

dup2 (fds[kWriteSide], stdout_fd);

And then add the file descriptor to the runloop:

[self startMonitoringFileDescriptor: fds[kReadSide]];
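Here are those steps strung together in plain C, minus the run loop: the sketch hijacks standard out, lets a function print, un-hijacks, and then reads what landed in the pipe. The names capture_stdout and say_hello are hypothetical, and because nothing drains the pipe while the function runs, this only suits output smaller than the pipe’s buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

enum { kReadSide, kWriteSide };

// Runs `emit` with standard out hijacked into a pipe, copies what it
// printed into `out`, and restores standard out.
// Returns the number of bytes captured, or -1 on failure.
ssize_t capture_stdout(void (*emit)(void), char *out, size_t out_size) {
    assert(emit != NULL && out != NULL && out_size > 0);
    int fds[2];

    fflush(stdout);                             // don't capture stale bytes
    int existingStdOut = dup(STDOUT_FILENO);    // backup for un-hijacking
    if (existingStdOut == -1) return -1;
    if (pipe(fds) == -1) { close(existingStdOut); return -1; }

    if (dup2(fds[kWriteSide], STDOUT_FILENO) == -1) {
        close(fds[kReadSide]);
        close(fds[kWriteSide]);
        close(existingStdOut);
        return -1;
    }
    close(fds[kWriteSide]);                     // fd 1 keeps the pipe open

    emit();
    fflush(stdout);                             // push stdio's buffer into the pipe

    dup2(existingStdOut, STDOUT_FILENO);        // un-hijack; closes the last write end
    close(existingStdOut);

    ssize_t count = read(fds[kReadSide], out, out_size - 1);
    close(fds[kReadSide]);
    if (count >= 0) out[count] = '\0';
    return count;
}

// Example emitter, purely for illustration.
void say_hello(void) { printf("hello from printf\n"); }
```

Calling capture_stdout(say_hello, buffer, sizeof buffer) leaves the printed line in buffer instead of on the console. The real Hijacker keeps the hijack in place and lets the run loop drain the read side continuously instead.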

The Actual Code

So, want some actual code?

I can never remember which of the pipe file descriptors is the read side or the write side, so make an enum with a human-readable name:

enum { kReadSide, kWriteSide };

The Hijacker object has some instance variables and properties for holding on to stuff, declared in the class extension (but it doesn’t really matter where, because you’ve got three places to choose from):

@interface XXFdHijacker () {
    int _pipe[2]; // populated by pipe()
    CFRunLoopSourceRef _monitorRunLoopSource; // Notifies us of activity on the pipe.
}
@property (assign, nonatomic) int fileDescriptor; // The fd we're hijacking
@property (assign, nonatomic) int oldFileDescriptor; // The original fd, for unhijacking
@property (assign, nonatomic) BOOL hijacking; // Are we hijacking or replicating?
@property (assign, nonatomic) BOOL replicating;
@end // extension

The pipe and runloop source are instance variables because they’re harder to express as properties. The other stuff is easy to represent as properties.

The method to start hijacking is called, cleverly enough, -startHijacking. After a quick short-circuit check, it duplicates the file descriptor:

- (void) startHijacking {
    if (self.hijacking) return;
    // Unix API is of the "return bad value, set errno" flavor.
    int result;
    // Make a copy of the file descriptor. The dup2 will close it,
    // but we want it to stick around for restoration and replication.
    self.oldFileDescriptor = dup (self.fileDescriptor);
    if (self.oldFileDescriptor == -1) {
        assert (!"could not dup our fd");
        return;
    }

Decent error checking is left as an exercise (because it’s very dependent on how your app handles that kind of stuff), but here it’ll cause death for the program if it can’t actually do any of this stuff. (That assert(!"string") is a quick way to kill the app and also get a printout. The ! turns the pointer value of the string to zero, which causes assert to fail and kill the app, and assert prints out the expression that killed it, which is the C string.)

Now make the pipe:

    // Make the pipe. Anchor one end of the pipe where the original fd is.
    // The other end will go to a runloop source so we can find bytes
    // written to it.
    result = pipe (_pipe);
    if (result == -1) {
        assert (!"could not make a pipe for standard out");
        return;
    }

And then do the hijack:

    // Replace the file descriptor with one part
    // (the writing side) of the pipe.
    result = dup2 (_pipe[kWriteSide], self.fileDescriptor);
    if (result == -1) {
        assert (!"could not dup2 our fd");
        return;
    }

So, why is this useful?

I think it’s kind of cool being able to grovel around in the Unix API calls like dup and pipe, and connect that to user-land in a form that’s actually useful. GestureLab keeps track of lines of text when they’re logged, and adds and removes them from the text field as the time scrubber is moved back and forth, letting you replay time as slowly or as quickly as you need.

I also used log-hijacking in a support tool for an app that stores stuff in an iCloud container shared by a suite of apps. A helper app would let someone browse the iCloud container, with all logging redirected to a text view so that any caveman debugging in the model objects or the iCloud glue code would be visible. I gave this tool to some of my non-technical beta testers, and when they saw something weird, a quick screen shot would include the last 20 or so lines of logging, which was often all I needed to track down the problem.