A blog for developers programming with Autodesk platforms, particularly AutoCAD and Forge, with a special focus on AR/VR and IoT.

March 20, 2013

Kinect Fusion inside AutoCAD

OK, here goes: my first (public) attempt at integrating the brand new Kinect Fusion functionality – made available this week in v1.7 of Microsoft’s Kinect for Windows SDK – into AutoCAD. There are still a few quirks, so I dare say I’ll be posting an update in due course.

As mentioned in the last post, I’ve been working on this for some time but can now show it publicly, as the required SDK capabilities have now been published. As part of this effort, I’ve gone ahead and made sure the other Kinect samples I’ve written for AutoCAD work with this version of the SDK: all can be found here.

Much of the work was mainly to integrate the appropriate Kinect API calls into an AutoCAD-resident jig, much as we've seen before when displaying/importing a single depth frame. Kinect Fusion introduces the idea of a reconstruction volume that gets gradually populated with data streamed in from a Kinect sensor, building up an underlying mesh that represents the 3D model.

AutoCAD is OK with meshes up to a certain size, but I wanted to get at the raw point data instead. The Kinect team has kindly provided the Reconstruction.ExportVolumeBlock() method for just this purpose – it’s intended to populate an array with voxel data which you can interpolate trilinearly to extract model/mesh information (erk) – but I haven’t yet been able to have it return anything but an array of zeroes. So the code currently asks the Kinect Fusion runtime to calculate a mesh from the reconstruction volume, and we then use the vertices from that mesh as the points to display.
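In case “interpolate trilinearly” sounds mysterious: here’s a sketch of the kind of trilinear sampling the voxel array is intended for, using the same voxels[z * slice + y * pitch + x] layout as the GetPointCloud2() method further down. SampleVolume is an illustrative helper of my own, not part of the SDK:

```csharp
using System;

static class VolumeSampling
{
  // Trilinearly interpolate a voxel grid laid out x-fastest, then y,
  // then z (as in GetPointCloud2 below) at an arbitrary point.
  public static float SampleVolume(
    short[] voxels, int resX, int resY, int resZ,
    double x, double y, double z
  )
  {
    int x0 = (int)Math.Floor(x);
    int y0 = (int)Math.Floor(y);
    int z0 = (int)Math.Floor(z);
    double fx = x - x0, fy = y - y0, fz = z - z0;

    // Read a corner sample, clamping to the grid bounds

    Func<int, int, int, double> at = (i, j, k) =>
      voxels[
        Math.Min(k, resZ - 1) * resX * resY +
        Math.Min(j, resY - 1) * resX +
        Math.Min(i, resX - 1)
      ];

    // Blend the eight surrounding corners: along X, then Y, then Z

    double c00 = at(x0, y0, z0) * (1 - fx) + at(x0 + 1, y0, z0) * fx;
    double c10 = at(x0, y0 + 1, z0) * (1 - fx) + at(x0 + 1, y0 + 1, z0) * fx;
    double c01 = at(x0, y0, z0 + 1) * (1 - fx) + at(x0 + 1, y0, z0 + 1) * fx;
    double c11 = at(x0, y0 + 1, z0 + 1) * (1 - fx) + at(x0 + 1, y0 + 1, z0 + 1) * fx;

    double c0 = c00 * (1 - fy) + c10 * fy;
    double c1 = c01 * (1 - fy) + c11 * fy;
    return (float)(c0 * (1 - fz) + c1 * fz);
  }
}
```

Sampling midway between a slice of zeroes and a slice of 100s returns 50, for instance – which is the kind of smoothing you’d use to place points (or an iso-surface) between voxel centres.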

The typical Kinect Fusion sample makes use of a quite different technique: it generates a shaded view of the model from a particular viewpoint – the underlying API casts rays into the reconstruction volume – which is very quick. Calculating a mesh and extracting its vertices is slower – especially once we get into the millions of points – so we have to accept that the responsiveness is going to be different.
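For contrast, the shaded-view approach looks roughly like this – a sketch only, using the SDK’s CalculatePointCloud() and ShadePointCloud() calls, with the frame sizes and member names borrowed from the jig code later in this post:

```csharp
// Roughly how the standard Kinect Fusion samples generate their live
// view: raycast the volume from the current camera pose and shade the
// result (a sketch -- assumes _volume, _worldToCameraTransform and
// the image dimensions from the jig below).

var pointCloudFrame =
  new FusionPointCloudImageFrame(ImageWidth, ImageHeight);
var shadedFrame =
  new FusionColorImageFrame(ImageWidth, ImageHeight);

// Cast rays into the reconstruction volume from the camera viewpoint

_volume.CalculatePointCloud(pointCloudFrame, _worldToCameraTransform);

// Shade the raycast points into a displayable image
// (passing null as we don't need the surface normals frame)

FusionDepthProcessor.ShadePointCloud(
  pointCloudFrame, _worldToCameraTransform, shadedFrame, null
);
```

This stays responsive because it only ever produces a 640 x 480 image, rather than a full mesh, per frame.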

And that’s mostly OK: we simply drop incoming frames when we’re already processing one, as otherwise we build up a queue of unprocessed frames leading to a significant lag between the movement of the sensor and the population of the reconstruction volume. But this also means that there’s a much bigger risk of the Kinect Fusion runtime not being able to track the movement – as the time between processed frames is larger and so are the differences – at which point we receive “tracking failures”.

Which ultimately means the user has to move the sensor really slowly to keep everything “on track”. Here’s a video that should give you a sense of the problem, as I attempt to capture a vase and an orchid on my dining table:

[I did edit the video to cut out some waiting as the points are fully imported at the end: the resulting point cloud has around 1.5 million points, so the current process of writing them to an ASCII file, converting this to LAS and then indexing the LAS to PCG is far too slow… this is something I am planning to streamline, incidentally.]

Here’s a normal photo of the scene, to give you a sense of what I’m trying to capture:

During the video, you’ll notice a number of tracking failures. When tracking fails, you have four main options:

Return the sensor to the position at which the tracking was last successful (to continue mapping).

Cancel the capture by hitting escape.

Complete the capture by clicking the mouse.

Let the errors accumulate: when the count hits 100 consecutive errors (this is coded in the sample – you could disable this or change the threshold) the reconstruction will get reset.

I hope that at some point I’ll be able to tweak the processing to make it sufficiently efficient to eliminate the problem of tracking being lost between frames. I also hope to be able to integrate colour into the point cloud: this isn’t something that’s directly provided by Kinect Fusion, but I expect there’s some way to get there.

Here’s the C# code for this implementation (you should download the complete samples – a repeat of the link from earlier in the post, in case you missed it – to see the code it relies upon):

using Autodesk.AutoCAD.EditorInput;
using Autodesk.AutoCAD.Geometry;
using Autodesk.AutoCAD.Runtime;
using Microsoft.Kinect;
using Microsoft.Kinect.Toolkit.Fusion;
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using System.Threading;
using System.Windows.Threading;

#pragma warning disable 1591

namespace KinectSamples
{
  public static class Utils
  {
    public static Point3dCollection Point3dFromVertCollection(
      ReadOnlyCollection<Vector3> vecs
    )
    {
      var pts = new Point3dCollection();
      foreach (var vec in vecs)
      {
        pts.Add(new Point3d(vec.X, vec.Z, -vec.Y));
      }
      return pts;
    }

    public static List<ColoredPoint3d> ColoredPoint3FromVertCollection(
      ReadOnlyCollection<Vector3> vecs
    )
    {
      var pts = new List<ColoredPoint3d>();
      foreach (var vec in vecs)
      {
        pts.Add(
          new ColoredPoint3d() { X = vec.X, Y = vec.Z, Z = -vec.Y }
        );
      }
      return pts;
    }
  }

  // A struct containing depth image pixels and frame timestamp

  internal struct DepthData
  {
    public DepthImagePixel[] DepthImagePixels;
    public long FrameTimestamp;
  }

  public class KinectFusionJig : KinectPointCloudJig
  {
    // Constants

    private const int MaxTrackingErrors = 100;
    private const int ImageWidth = 640;
    private const int ImageHeight = 480;
    private const ReconstructionProcessor ProcessorType =
      ReconstructionProcessor.Amp;
    private const int DeviceToUse = -1;
    private const bool AutoResetReconstructionWhenLost = true;
    private const int ResetOnTimeStampSkippedMillisecondsGPU = 3000;
    private const int ResetOnTimeStampSkippedMillisecondsCPU = 6000;

    // Member variables

    private Editor _ed;
    private SynchronizationContext _ctxt;
    private double _roomWidth;
    private double _roomLength;
    private double _roomHeight;
    private int _lowResStep;
    private int _voxelsPerMeter;
    private FusionFloatImageFrame _depthFloatBuffer;
    private Matrix4 _worldToCameraTransform;
    private Matrix4 _defaultWorldToVolumeTransform;
    private Reconstruction _volume;
    private int _processedFrameCount;
    private long _lastFrameTimestamp = 0;
    private bool _lastTrackingAttemptSucceeded;
    private int _trackingErrors;
    private int _frameDataLength;
    private bool _processing;
    private bool _translateResetPoseByMinDepthThreshold = true;
    private float _minDepthClip =
      FusionDepthProcessor.DefaultMinimumDepth;
    private float _maxDepthClip =
      FusionDepthProcessor.DefaultMaximumDepth;

    // Constructor

    public KinectFusionJig(
      Editor ed, SynchronizationContext ctxt,
      double width, double length, double height, int vpm, int step
    )
    {
      _ed = ed;
      _ctxt = ctxt;
      _roomWidth = width;
      _roomLength = length;
      _roomHeight = height;
      _voxelsPerMeter = vpm;
      _lowResStep = step;
      _processing = false;
      _lastTrackingAttemptSucceeded = true;
      _vecs = new List<ColoredPoint3d>();
    }

    private void PostToAutoCAD(SendOrPostCallback cb)
    {
      _ctxt.Post(cb, null);
      System.Windows.Forms.Application.DoEvents();
    }

    public override bool StartSensor()
    {
      if (_kinect != null)
      {
        _kinect.DepthStream.Enable(
          DepthImageFormat.Resolution640x480Fps30
        );
        _frameDataLength = _kinect.DepthStream.FramePixelDataLength;

        try
        {
          // Allocate a volume

          var volParam =
            new ReconstructionParameters(
              _voxelsPerMeter,
              (int)(_voxelsPerMeter * _roomWidth),
              (int)(_voxelsPerMeter * _roomHeight),
              (int)(_voxelsPerMeter * _roomLength)
            );

          _worldToCameraTransform = Matrix4.Identity;
          _volume =
            Reconstruction.FusionCreateReconstruction(
              volParam, ProcessorType, DeviceToUse,
              _worldToCameraTransform
            );
          _defaultWorldToVolumeTransform =
            _volume.GetCurrentWorldToVolumeTransform();

          ResetReconstruction();
        }
        catch (InvalidOperationException ex)
        {
          _ed.WriteMessage("Invalid operation: " + ex.Message);
          return false;
        }
        catch (DllNotFoundException ex)
        {
          _ed.WriteMessage("DLL not found: " + ex.Message);
          return false;
        }
        catch (ArgumentException ex)
        {
          _ed.WriteMessage("Invalid argument: " + ex.Message);
          return false;
        }
        catch (OutOfMemoryException ex)
        {
          _ed.WriteMessage("Out of memory: " + ex.Message);
          return false;
        }

        _depthFloatBuffer =
          new FusionFloatImageFrame(ImageWidth, ImageHeight);

        _kinect.Start();
        _kinect.ElevationAngle = 0;

        return true;
      }

      _ed.WriteMessage(
        "\nUnable to start Kinect sensor - " +
        "are you sure it's plugged in?"
      );
      return false;
    }

    public void OnDepthFrameReady(
      object sender, DepthImageFrameReadyEventArgs e
    )
    {
      if (!_processing && !_finished)
      {
        using (var depthFrame = e.OpenDepthImageFrame())
        {
          if (depthFrame != null)
          {
            DepthData depthData = new DepthData();

            // Save frame timestamp

            depthData.FrameTimestamp = depthFrame.Timestamp;

            // Create local depth pixels buffer

            depthData.DepthImagePixels =
              new DepthImagePixel[depthFrame.PixelDataLength];

            // Copy depth pixels to local buffer

            depthFrame.CopyDepthImagePixelDataTo(
              depthData.DepthImagePixels
            );

            // Process on a background thread

            Dispatcher.CurrentDispatcher.BeginInvoke(
              DispatcherPriority.Background,
              (Action<DepthData>)((d) => ProcessDepthData(d)),
              depthData
            );

            // Stop other processing from happening until the
            // background processing of this frame has completed

            _processing = true;
          }
        }
      }
    }

    // Process the depth input

    private void ProcessDepthData(DepthData depthData)
    {
      try
      {
        CheckResetTimeStamp(depthData.FrameTimestamp);

        // Convert the depth image frame to depth float image frame

        FusionDepthProcessor.DepthToDepthFloatFrame(
          depthData.DepthImagePixels,
          ImageWidth,
          ImageHeight,
          _depthFloatBuffer,
          FusionDepthProcessor.DefaultMinimumDepth,
          FusionDepthProcessor.DefaultMaximumDepth,
          false
        );

        bool trackingSucceeded =
          _volume.ProcessFrame(
            _depthFloatBuffer,
            FusionDepthProcessor.DefaultAlignIterationCount,
            FusionDepthProcessor.DefaultIntegrationWeight,
            _volume.GetCurrentWorldToCameraTransform()
          );

        if (!trackingSucceeded)
        {
          _trackingErrors++;

          PostToAutoCAD(
            a =>
            {
              _ed.WriteMessage(
                "\nTracking failure. Keep calm and carry on."
              );
              if (AutoResetReconstructionWhenLost)
              {
                _ed.WriteMessage(
                  " ({0}/{1})",
                  _trackingErrors, MaxTrackingErrors
                );
              }
              else
              {
                _ed.WriteMessage(" {0}", _trackingErrors);
              }
            }
          );
        }
        else
        {
          if (!_lastTrackingAttemptSucceeded)
          {
            PostToAutoCAD(
              a => _ed.WriteMessage("\nWe're back on track!")
            );
          }

          // Set the camera pose and reset tracking errors

          _worldToCameraTransform =
            _volume.GetCurrentWorldToCameraTransform();
          _trackingErrors = 0;
        }
        _lastTrackingAttemptSucceeded = trackingSucceeded;

        if (
          AutoResetReconstructionWhenLost &&
          !trackingSucceeded &&
          _trackingErrors >= MaxTrackingErrors
        )
        {
          PostToAutoCAD(
            a =>
            {
              _ed.WriteMessage(
                "\nReached error threshold: automatically resetting."
              );
              _vecs.Clear();
            }
          );
          Console.Beep();

          ResetReconstruction();
        }

        _points = GetPointCloud(true);

        ++_processedFrameCount;
      }
      catch (InvalidOperationException ex)
      {
        PostToAutoCAD(
          a =>
          {
            _ed.WriteMessage(
              "\nInvalid operation: {0}", ex.Message
            );
          }
        );
      }

      // We can now let other processing happen

      _processing = false;
    }

    // Check if the gap between 2 frames has reached reset time
    // threshold. If yes, reset the reconstruction

    private void CheckResetTimeStamp(long frameTimestamp)
    {
      if (0 != _lastFrameTimestamp)
      {
        long timeThreshold =
          (ReconstructionProcessor.Amp == ProcessorType) ?
            ResetOnTimeStampSkippedMillisecondsGPU :
            ResetOnTimeStampSkippedMillisecondsCPU;

        // Calculate skipped milliseconds between 2 frames

        long skippedMilliseconds =
          Math.Abs(frameTimestamp - _lastFrameTimestamp);

        if (skippedMilliseconds >= timeThreshold)
        {
          PostToAutoCAD(
            a => _ed.WriteMessage("\nResetting reconstruction.")
          );
          ResetReconstruction();
        }
      }

      // Set timestamp of last frame

      _lastFrameTimestamp = frameTimestamp;
    }

    // Reset the reconstruction to initial value

    private void ResetReconstruction()
    {
      // Reset tracking error counter

      _trackingErrors = 0;

      // Set the world-view transform to identity, so the world
      // origin is the initial camera location.

      _worldToCameraTransform = Matrix4.Identity;

      if (_volume != null)
      {
        // Translate the reconstruction volume location away from
        // the world origin by an amount equal to the minimum depth
        // threshold. This ensures that some depth signal falls
        // inside the volume. If set false, the default world origin
        // is set to the center of the front face of the volume,
        // which has the effect of locating the volume directly in
        // front of the initial camera position with the +Z axis
        // into the volume along the initial camera direction of
        // view.

        if (_translateResetPoseByMinDepthThreshold)
        {
          Matrix4 worldToVolumeTransform =
            _defaultWorldToVolumeTransform;

          // Translate the volume in the Z axis by the
          // minDepthThreshold distance

          float minDist =
            (_minDepthClip < _maxDepthClip) ?
              _minDepthClip :
              _maxDepthClip;
          worldToVolumeTransform.M43 -= minDist * _voxelsPerMeter;

          _volume.ResetReconstruction(
            _worldToCameraTransform, worldToVolumeTransform
          );
        }
        else
        {
          _volume.ResetReconstruction(_worldToCameraTransform);
        }
      }
    }

    protected override SamplerStatus SamplerData()
    {
      if (_vecs.Count > 0)
      {
        _points.Clear();

        foreach (var vec in _vecs)
        {
          _points.Add(
            new Point3d(vec.X, vec.Y, vec.Z)
          );
        }
      }

      ForceMessage();

      return SamplerStatus.OK;
    }

    public override void AttachHandlers()
    {
      // Attach the event handlers

      if (_kinect != null)
      {
        _kinect.DepthFrameReady +=
          new EventHandler<DepthImageFrameReadyEventArgs>(
            OnDepthFrameReady
          );
      }
    }

    public override void RemoveHandlers()
    {
      // Detach the event handlers

      if (_kinect != null)
      {
        _kinect.DepthFrameReady -=
          new EventHandler<DepthImageFrameReadyEventArgs>(
            OnDepthFrameReady
          );
      }
    }

    public Mesh GetMesh()
    {
      return _volume.CalculateMesh(1);
    }

    // Get a point cloud from the vertices of a mesh
    // (would be better to access the volume info directly)

    public Point3dCollection GetPointCloud(bool lowRes = false)
    {
      using (var m = _volume.CalculateMesh(lowRes ? _lowResStep : 1))
      {
        return Utils.Point3dFromVertCollection(
          m.GetVertices()
        );
      }
    }

    public List<ColoredPoint3d> GetColoredPointCloud(
      bool lowRes = false
    )
    {
      using (var m = _volume.CalculateMesh(lowRes ? _lowResStep : 1))
      {
        return Utils.ColoredPoint3FromVertCollection(
          m.GetVertices()
        );
      }
    }

    // Get a point cloud from the volume directly
    // (does not currently work)

    public Point3dCollection GetPointCloud2(bool lowRes = false)
    {
      var step = lowRes ? _lowResStep : 1;
      var res = _voxelsPerMeter / step;
      var destResX = (int)(_roomWidth * res);
      var destResY = (int)(_roomHeight * res);
      var destResZ = (int)(_roomLength * res);
      var destRes = destResX * destResY * destResZ;
      var voxels = new short[destRes];

      // This should return an array of voxels:
      // these are currently all 0

      _volume.ExportVolumeBlock(
        0, 0, 0, destResX, destResY, destResZ, step, voxels
      );

      var pitch = destResX;
      var slice = destResY * pitch;
      var fac = step / 100.0;
      var pts = new Point3dCollection();
      for (int x = 0; x < destResX; x++)
      {
        for (int y = 0; y < destResY; y++)
        {
          for (int z = 0; z < destResZ; z++)
          {
            var vox = voxels[z * slice + y * pitch + x];
            if (vox > 0) // != 0x80 && vox == 0)
            {
              pts.Add(new Point3d(x * fac, z * fac, -y * fac));
            }
          }
        }
      }
      return pts;
    }

    protected override void ExportPointCloud(
      List<ColoredPoint3d> vecs, string filename
    )
    {
      if (vecs.Count > 0)
      {
        using (StreamWriter sw = new StreamWriter(filename))
        {
          // For each point, write a line to the text file:
          // X, Y, Z, R, G, B

          foreach (ColoredPoint3d pt in vecs)
          {
            sw.WriteLine(
              "{0}, {1}, {2}, {3}, {4}, {5}",
              pt.X, pt.Y, pt.Z, pt.R, pt.G, pt.B
            );
          }
        }
      }
    }

    protected void ExportPointCloud(
      Point3dCollection pts, string filename
    )
    {
      if (pts.Count > 0)
      {
        using (StreamWriter sw = new StreamWriter(filename))
        {
          // For each point, write a line to the text file:
          // X, Y, Z, R, G, B (with zeroed color values)

          foreach (Point3d pt in pts)
          {
            sw.WriteLine("{0},{1},{2},0,0,0", pt.X, pt.Y, pt.Z);
          }
        }
      }
    }
  }

  public class KinectFusionCommands
  {
    private const int RoomWidth = 3;
    private const int RoomHeight = 2;
    private const int RoomLength = 3;
    private const int VoxelsPerMeter = 256;
    private const int LowResStep = 4;

    private double _roomWidth = RoomWidth;
    private double _roomLength = RoomLength;
    private double _roomHeight = RoomHeight;
    private int _voxelsPerMeter = VoxelsPerMeter;
    private int _lowResStep = LowResStep;

    [CommandMethod("ADNPLUGINS", "KINFUS", CommandFlags.Modal)]
    public void ImportFromKinectFusion()
    {
      var doc =
        Autodesk.AutoCAD.ApplicationServices.
          Application.DocumentManager.MdiActiveDocument;
      var db = doc.Database;
      var ed = doc.Editor;

      // Ask the user for double information

      var pdo = new PromptDoubleOptions("\nEnter width of volume");
      pdo.AllowNegative = false;
      pdo.AllowZero = false;
      pdo.DefaultValue = _roomWidth;
      pdo.UseDefaultValue = true;

      var pdr = ed.GetDouble(pdo);
      if (pdr.Status != PromptStatus.OK)
        return;
      _roomWidth = pdr.Value;

      pdo.Message = "\nEnter length of volume";
      pdo.DefaultValue = _roomLength;
      pdr = ed.GetDouble(pdo);
      if (pdr.Status != PromptStatus.OK)
        return;
      _roomLength = pdr.Value;

      pdo.Message = "\nEnter height of volume";
      pdo.DefaultValue = _roomHeight;
      pdr = ed.GetDouble(pdo);
      if (pdr.Status != PromptStatus.OK)
        return;
      _roomHeight = pdr.Value;

      // Ask the user for integer information

      var pio =
        new PromptIntegerOptions("\nEnter voxels per meter");
      pio.AllowNegative = false;
      pio.AllowZero = false;
      pio.DefaultValue = _voxelsPerMeter;
      pio.UseDefaultValue = true;

      var pir = ed.GetInteger(pio);
      if (pir.Status != PromptStatus.OK)
        return;
      _voxelsPerMeter = pir.Value;

      pio.Message = "\nLow resolution sampling";
      pio.DefaultValue = _lowResStep;
      pir = ed.GetInteger(pio);
      if (pir.Status != PromptStatus.OK)
        return;
      _lowResStep = pir.Value;

      // Create a form to set the sync context properly

      using (var f1 = new Form1())
      {
        var ctxt = SynchronizationContext.Current;
        if (ctxt == null)
        {
          throw
            new System.Exception(
              "Current sync context is null."
            );
        }

        // Create our jig

        var kj =
          new KinectFusionJig(
            ed, ctxt,
            _roomWidth, _roomLength, _roomHeight,
            _voxelsPerMeter, _lowResStep
          );

        if (!kj.StartSensor())
        {
          kj.StopSensor();
          return;
        }

        var pr = ed.Drag(kj);
        if (pr.Status != PromptStatus.OK && !kj.Finished)
        {
          kj.StopSensor();
          return;
        }

        kj.PauseSensor();

        try
        {
          ed.WriteMessage(
            "\nCapture complete: examining points...\n"
          );
          System.Windows.Forms.Application.DoEvents();

          var pts = kj.GetColoredPointCloud();

          ed.WriteMessage(
            "Extracted mesh data: {0} vertices.\n",
            pts.Count
          );
          System.Windows.Forms.Application.DoEvents();

          kj.WriteAndImportPointCloud(doc, pts);
        }
        catch (System.Exception ex)
        {
          ed.WriteMessage("\nException: {0}", ex.Message);
        }
        kj.StopSensor();
      }
    }
  }
}

Despite some of the issues – which relate mainly to the fact that we’re trying to extract 3D data in real time from the Kinect Fusion runtime – hopefully you can see that this is very interesting technology. If you have a Kinect for Windows sensor, you can also use the Kinect Explorer sample from the KfW SDK to create a mesh (an .OBJ or .STL) that you can then import into the 3D software of your choice. Very cool.