Primary Navigation

Fusing the Kinect, AutoCAD and Kinect Fusion (with some C#)

Description

Given the release of Kinect for Windows SDK v1.7 and the inclusion Kinect Fusion, the next two posts are about, well, Kinect Fusion!

Please join me in welcoming back Kean Walmsley and his combining the Kinect and AutoCAD (and with Fusion, that seems like a match made in... well, maybe after reading this, not Heaven, but pretty close!)

OK, here goes: my first (public) attempt at integrating the brand new Kinect Fusion functionality – made available this week in v1.7 of Microsoft’s Kinect for Windows SDK – into AutoCAD. There are still a few quirks, so I dare say I’ll be posting an update in due course.

As mentioned in the last post, I’ve been working on this for some time but can now show it publicly, as the required SDK capabilities have now been published. As part of this effort, I’ve gone ahead and made sure the other Kinect samples I’ve written for AutoCAD work with this version of the SDK: all can be found here.

Much of the work was clearly to integrate the appropriate Kinect API calls into an AutoCAD-resident jig, much in the way we’ve seen before when displaying/importing a single depth frame. Kinect Fusion introduces the idea of a reconstruction volume that gets gradually populated with data streamed in from a Kinect sensor, building up an underlying mesh that represents the 3D model.

AutoCAD is OK with meshes to a certain size, but I wanted to get at the raw point data, instead. The Kinect team has kindly provided the Reconstruction.ExportVolumeBlock() method for just this purpose – it’s intended to populate an array with voxel data which you can interpolate trilinearly to extract model/mesh information (erk) – but I haven’t yet been able to have it return anything but an array of zeroes. So the code is currently asking the Kinect Fusion runtime to calculate a mesh from the reconstruction volume and we then use the vertices from that mesh as points to display.

The typical Kinect Fusion sample makes use of a quite different technique: it generates a shaded view of the mesh from a particular viewpoint – the underlying API casts rays into the reconstruction volume – which is very quick. Calculating a mesh and extracting its vertices is slower – especially when we get into the millions of points – so we have to accept the responsiveness is going to be different.

And that’s mostly OK: we simply drop incoming frames when we’re already processing one, as otherwise we build up a queue of unprocessed frames leading to a significant lag between the movement of the sensor and the population of the reconstruction volume. But this also means that there’s a much bigger risk of the Kinect Fusion runtime not being able to track the movement – as the time between processed frames is larger and so are the differences – at which point we receive “tracking failures”.