From July 2008 until October 2010, I worked at Microsoft on the Xbox Incubation team. During this time, we incubated and productized Kinect - a new input device for the Xbox 360, which lets you control games (via full-body 3D tracking and voice recognition) without holding anything in your hands.

The Challenge

It turns out that skeletal tracking (i.e. motion capture) is a very difficult problem to solve robustly. Think of all the degrees of freedom (movable joints) in the body - people can bend, twist, hide their arms against their body, put limbs behind their back, and so on. If you take a person, make their torso face the camera, and then let them move each joint in their arms and legs into all possible pose combinations, at 15-degree steps (which is quite a coarse step), you get about 2^48 (roughly 281 trillion) possible poses - and that's just when the person is facing the camera! Now multiply by at least 24 to handle upright torso rotation (in 15-degree steps), and by even more to handle non-upright torso positions. As you can see, the pose space is virtually infinite.
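The arithmetic behind that 2^48 figure can be sketched in a few lines. This is purely a back-of-envelope illustration, not the actual joint model used on Kinect; the 10-11 rotational-axis count below is my assumption, chosen because it lands in the right ballpark:

```python
import math

# Back-of-envelope sketch of the pose-space estimate.
# Assumption (mine, for illustration only): the arms and legs together
# expose roughly 10-11 freely movable rotational axes.
STEP_DEGREES = 15
positions_per_axis = 360 // STEP_DEGREES  # 24 discrete positions per axis

for num_axes in (10, 11):
    poses = positions_per_axis ** num_axes
    print(f"{num_axes} axes: {poses:.2e} poses (~2^{math.log2(poses):.1f})")

# Multiplying by another factor of 24 for upright torso rotation
# (also in 15-degree steps) pushes the count higher still.
with_torso = positions_per_axis ** 10 * 24
print(f"with upright torso rotation: {with_torso:.2e}")
```

With 10 axes you get about 2^45.9 poses and with 11 about 2^50.4, so a figure of roughly 2^48 sits comfortably in between - the point being the order of magnitude, not the exact count.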

On top of that complexity, think of the vast variety in people's shapes, sizes, hair, clothing, and so on. Further, when you look at a scene, it's very hard to know what to track - what is human and what isn't. Finally, you have to do all of this computation in realtime (bwahaha...) - unlike offline mocap systems, you can't compute the answer at leisure, nor can you look backwards and forwards in time.

As you can see, skeletal tracking is a rather difficult problem.

My Role

I developed body-tracking code for Kinect (formerly Project Natal), which allows you to control the Xbox using only your body - no hand-held controller is required.

In my first two months at Microsoft, I wrote a powerful skeletal tracker that largely convinced the company (on the software front, at least) to greenlight Project Natal.

Over the course of the next 8 months I was the primary developer evolving this tracker, with support from a small team. During this time, we also integrated a new body-part detection algorithm (Exemplar) from MSR, which further improved it. When we first showed Project Natal to the world, at E3 in June 2009, it was driven by this tracker.

After that, the Production team took over writing the shipping tracker, and I went back to R&D for about six months.

Then, in February 2010, I took a look at a major problem: the shipping tracker could not properly detect or track people while they were sitting down. Within a few weeks, I had written an alternate tracker that solved this problem and, as a bonus, produced extremely accurate hand positions. This tracker did depend on MSR's (brilliant) Exemplar algorithm, but I wrote all of the tracking code aside from that.

Over the next six months, I worked closely with the Production team to integrate this new tracker into the shipping codebase, and to further improve the accuracy to extremely high levels. This tracker ended up shipping with Kinect, but only powering the Dash - games had not yet had time to take advantage of it. (But, hopefully, we will see games powered by this tracker very soon!)

I also contributed to a large number of patents (17, I believe) during my time with the company.