Neat. Okay. Function pointers in C are powerful stuff -- powerful enough that C++ was originally a preprocessor, not a language unto itself.

Here's the thing I want to do. Imagine the lights are part of a hierarchy. We have a light_controller_t for a whole building, and one for every room. We manage this hierarchy separately. When we turn off a parent light controller, we want all the child lights to be explicitly switched off.

This is easy. We implement the switch function to do its own work, and call the function for any child.
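
A minimal sketch of that in C, assuming a made-up layout for light_controller_t. The is_on flag, the set_power pointer, the fixed-size children array and the two helper functions are my own illustrative names, not taken from any real code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8  /* hypothetical fixed fan-out, chosen for brevity */

typedef struct light_controller light_controller_t;

struct light_controller {
    int is_on;                                            /* this node's own lights */
    void (*set_power)(light_controller_t *self, int on);  /* the "switch" function */
    light_controller_t *children[MAX_CHILDREN];
    size_t child_count;
};

/* Do this node's own work, then make the same call on every child. */
void light_set_power(light_controller_t *self, int on)
{
    self->is_on = on;
    for (size_t i = 0; i < self->child_count; i++) {
        light_controller_t *child = self->children[i];
        child->set_power(child, on);  /* goes through the child's own pointer */
    }
}

/* Start with the lights on and no children. */
void light_init(light_controller_t *lc)
{
    lc->is_on = 1;
    lc->set_power = light_set_power;
    lc->child_count = 0;
}

void light_add_child(light_controller_t *parent, light_controller_t *child)
{
    assert(parent->child_count < MAX_CHILDREN);
    parent->children[parent->child_count++] = child;
}
```

Calling building.set_power(&building, 0) on a controller with two rooms attached switches off all three. Because each child is called through its own function pointer, a room could install a different implementation without the parent knowing.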

Suppose then that we implement a lot of services, controlling windows, radios, ovens, etc. There are a lot of function calls here. Some parts of the hierarchy won't have some services (e.g. first_floor.services[oven_controller]) but will have children that do (i.e. first_floor.children[kitchen].services[oven_controller]).

You can think of this as a tree of composed service nodes. Each node in the tree is a service that may or may not implement certain interfaces. The little services might be pretty simple.

But what do we do when a node doesn't implement a service? We want child nodes implementing the service to receive the invocation. Something like this (pseudocode):

function service_node.get_service(service_type):
    if (self implements service_type)
        return self as service_type
    else
        return service_proxy(self, service_type) as service_type
    end
end

function service_proxy(parent, service_type):
    self = new service_type()
    for (method in service_type.methods)
        method = function(args...):
            for (child in parent.children)
                child.method(args)
            end
        end
    end
    return self
end
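
For one method with a signature known up front, the forwarding can be written by hand. Here is a rough C sketch under my own assumptions: service_type_t, service_t, service_node_t and node_set_power are hypothetical names, and I've assumed a node that lacks a service simply recurses into its children. What this can't give us is the generic "for every method, whatever its arguments" part of the pseudocode.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4  /* hypothetical limit, for brevity */

typedef enum { LIGHT_CONTROLLER, OVEN_CONTROLLER, SERVICE_COUNT } service_type_t;

typedef struct service {
    int powered;
    void (*set_power)(struct service *self, int on);
} service_t;

typedef struct service_node {
    service_t *services[SERVICE_COUNT];          /* NULL where not implemented */
    struct service_node *children[MAX_CHILDREN];
    size_t child_count;
} service_node_t;

void service_set_power(service_t *self, int on)
{
    self->powered = on;
}

/* Hand-written proxy for exactly one method, set_power(int).
   Returns how many implementing services received the call. */
int node_set_power(service_node_t *node, service_type_t type, int on)
{
    assert(type < SERVICE_COUNT);
    service_t *own = node->services[type];
    if (own != NULL) {
        own->set_power(own, on);   /* this node implements the service itself */
        return 1;
    }
    int reached = 0;               /* otherwise act as the proxy for our children */
    for (size_t i = 0; i < node->child_count; i++)
        reached += node_set_power(node->children[i], type, on);
    return reached;
}
```

Every additional method, and every additional signature, needs another function shaped like node_set_power.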

Now, if you know something about C, you know that there isn't any easy way to do this. For starters, the language features aren't present: we've got a closure we can work around, but we've also got something that looks like varargs but can't be, because varargs in C are a dead end that can't generically "splat" back to arguments.
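
To make the varargs dead end concrete, here is a small sketch. log_format is a hypothetical wrapper of mine; vsnprintf is the standard library function. Forwarding works here only because the callee was written to accept a va_list. There is no standard way to hand those same arguments to an ordinary function, or to a variadic one such as snprintf itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: formats into buf and returns vsnprintf's result. */
int log_format(char *buf, size_t size, const char *fmt, ...)
{
    assert(buf != NULL && fmt != NULL);

    va_list args;
    va_start(args, fmt);
    /* This compiles only because vsnprintf takes a va_list.
       Writing snprintf(buf, size, fmt, args...) is not something C offers. */
    int written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}
```

A call like log_format(buf, sizeof buf, "%d-%s", 7, "ok") behaves like the equivalent snprintf call. The proxy we want has no such luck: the functions it forwards to take plain arguments, not a va_list.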

"But Justin," you say (startling me), "this is a clear case of using the wrong language for the task." And that is certainly true for the task I have described. Rejoice, Rubyists (or "Rubes," as they are popularly known):

I know. I know! Awesome! (Other languages can do this too; for example, PHP has __call, which does the same job but provides 96% less smug satisfaction).

The problem is that we're dealing with an isolated hypothetical, and while Ruby may or may not be 67 times slower than C, it sure isn't faster. The real-life version of the problem I'm describing has to do this resolution thousands of times per second at a speed that won't be more than a blip in profiling, or I'll have to rip it out. Dynamic Language Du Jour isn't the answer. I only need a tiny subset of dynamic language features for one limited application, with all the speed that comes from that lack of flexibility. An abomination is only abominable in proportion to its uselessness!

(As a digression, I believe the previous sentence should be in the foreword of any book about C++.)

But Ruby's original interpreter, MRI, is written in C. Everything is possible in C + inline asm for a sufficiently painful definition of "possible," right? Isn't this all just a bunch of registers, some instructions, a heap and a stack? All we want to do (I think) is forward an invocation, exactly as-is, to an address we provide at runtime. We can refresh our memory of calling convention and patch up a generic call ourselves using pointer math. No?

Obviously this only looks easy because this pseudocode pretends it's O.G., using terms like eip, while it totally glosses over things like the for-each loop, and how it knows sizeof(args), and its recent One Direction collaboration. We could leave extra stack space -- say, enough for ten doubles. This doesn't look impossible.

Here's an actual i386/x86_64 implementation (condensed), courtesy of StackOverflow user Coltox, who looks like he signed up for the express purpose of ninja-ing this answer at all the people who said it couldn't be done. It can be done, and to answer the implicit question, it's very painful.

The code that follows splats one varargs call to one function invocation, the signature of which is available at compile time.

(Bill: "I know, let's make a proxy to printf the return code from printf!" Carol: "Go home Bill, you're drunk.")

Still! Now we have a working solution! Let's hook it up!

...except that floats get passed in special float registers: don't pass any floats. Also, it doesn't do __cdecl or __fastcall. Also, it doesn't do ARM or PPC. Also, THIS IS MORE WORK. Remember our "manual" proxy?