Web-enabled Kinect

There are Kinect hacks out there for robot vision, 3D scanners, and even pseudo-LIDAR setups. Until now, one limiting factor in these builds has been the need for a full-blown computer on the device to deal with the depth maps and do all the necessary processing and computation. That's much less of a problem now that [wizgrav] has published Intrael, an HTTP interface for the Kinect.

[Eleftherios] caught up with [wizgrav] at his local hackerspace, where [wizgrav] gave a short tutorial on Intrael. [wizgrav]'s project serves each frame from the Kinect over HTTP, wrapped up in JSON arrays. Everything a Kinect outputs, aside from sound, is now easily available over the Internet.
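To give a feel for what consuming those JSON arrays looks like, here's a minimal sketch of decoding one frame's worth of blob data. The field names and units are our own illustration, not Intrael's actual wire format:

```python
import json

# Hypothetical example of one frame from the blob tracker: a JSON array
# with one object per detected blob (this layout is an assumption).
frame_json = '[{"x": 120, "y": 80, "w": 64, "h": 96, "depth": 1850}]'

def parse_frame(text):
    """Decode one JSON frame into a list of blob dicts."""
    return json.loads(text)

blobs = parse_frame(frame_json)
for b in blobs:
    # Depth is in millimetres in this sketch.
    print(f"blob at ({b['x']}, {b['y']}), {b['depth']} mm away")
```

In practice a client would poll the server (or hold a streaming connection) and run each received frame through a parser like this.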

The project is meant to take computer vision out of the realm of desktops and robot-mounted laptops and put it on the web. [wizgrav] has a few ideas on what his project can be used for, such as smart security cameras and all kinds of interactive surfaces.

After the break, check out the Intrael primer [wizgrav] demonstrated (it’s Greek to us, but there are subtitles), and a few demos of what Intrael ‘sees.’

The computer needed for the heavy lifting should be no more than an ARM Cortex-A8 @ 1 GHz. The client/server approach also has the advantage of decoupling the box that handles the Kinect from the one handling the output. It's been tested and works great over WiFi, which comes in very handy: since the Kinect can track up to 10 m away, USB cable length is no longer an issue.

Not quite. I think matt was right when he said that the summary is misleading. The images are not base64 encoded. They're encoded as JPEGs and presented as a stream through an <img> tag, using a technique called MJPEG over HTTP. You can read about it here
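The MJPEG-over-HTTP idea is simple enough to sketch on the client side: the server sends one JPEG after another, and a reader just scans for the JPEG start (FFD8) and end (FFD9) markers to pull complete frames out of the byte stream. The boundary string and headers below are illustrative, not Intrael's actual ones:

```python
def extract_jpegs(buf):
    """Pull complete JPEG frames out of an MJPEG byte stream by scanning
    for the JPEG start-of-image (FFD8) and end-of-image (FFD9) markers.
    Returns the frames found plus any leftover bytes."""
    frames = []
    while True:
        start = buf.find(b"\xff\xd8")
        end = buf.find(b"\xff\xd9", start + 2)
        if start == -1 or end == -1:
            break
        frames.append(buf[start:end + 2])
        buf = buf[end + 2:]
    return frames, buf

# Two fake "JPEG" payloads separated by multipart boundary lines,
# mimicking what an MJPEG server emits.
stream = (b"--myboundary\r\nContent-Type: image/jpeg\r\n\r\n"
          b"\xff\xd8fake-frame-1\xff\xd9\r\n"
          b"--myboundary\r\nContent-Type: image/jpeg\r\n\r\n"
          b"\xff\xd8fake-frame-2\xff\xd9\r\n")
frames, rest = extract_jpegs(stream)
print(len(frames))  # 2 frames recovered
```

A browser does the equivalent internally when the stream is pointed at an <img> tag, which is why no JavaScript is needed just to display it.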

The MJPEG stream and the one that delivers the data from the blob tracking are served from separate paths on the server. I like what you did, though; I opted for lossy transmission for practical reasons. I also experimented with WebSockets for the JSON data delivery but got fed up with the protocol changing all the time. Intrael (optionally) supports Server-Sent Events, which is basically a one-way WebSocket; maybe that would be of interest to you as well.
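Server-Sent Events are also easy to consume by hand: the server keeps a text connection open and sends `data:` lines, with a blank line terminating each event. A minimal parser for that framing (the sample payloads here are made up) might look like:

```python
def parse_sse(chunk):
    """Split a Server-Sent Events text chunk into the data payload of
    each event. Events are separated by a blank line; multi-line data
    fields are joined with newlines per the SSE framing rules."""
    events = []
    for block in chunk.split("\n\n"):
        data_lines = [line[5:].lstrip() for line in block.split("\n")
                      if line.startswith("data:")]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# Two events, each carrying a (hypothetical) JSON array of blob data.
sample = "data: [1,2,3]\n\ndata: [4,5,6]\n\n"
print(parse_sse(sample))  # ['[1,2,3]', '[4,5,6]']
```

In the browser the built-in `EventSource` object handles all of this, which is what makes SSE attractive for one-way delivery like this.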

Yeah, a Vorbis stream would achieve a better compression ratio, even though it would be lossy as well. Another reason MJPEG was chosen was its ease of implementation compared to regular video streaming: you just have to stream regular JPEGs with a text boundary between them. It's much lighter on resources than normal video compression and the results are still usable as crop material. But if you want to further analyze the pixel data in the browser, the best solution would be a lossless format like PNG, which I also tested, but the file sizes got pretty big. I'm still thinking about it, though.