The Strong, Weak, Open, and Transparent

So, it’s been a couple of years since I’ve felt compelled to post to this blog, but I think it’s high time for an update. I’m just going to quickly touch on a few of things I’m excited about, having just attended Augmented Reality Event 2011.

Things in the Augmented Reality world have progressed rapidly, if not as rapidly as I might once have imagined they would. In one of my first posts, I closed with an idea about streaming one’s first-person POV to a giant Microsoft Photosynth system in the cloud. The Bing Maps team, under Blaise Aguera y Arcas and Avi Bar-Zeev, is doing exactly that. With Read / Write World, Microsoft is developing what I think will be the foundation of what Blaise called “Strong AR.” This is in contrast with the “weak,” strictly sensor-based AR applications that we’re seeing on mobile devices at the moment.

To clarify, there are two paradigms of current AR usage:

One of these two is local vision-based AR using marker or texture tracking to position virtual objects relative to a camera’s perspective. This is done by calculating the homography that describes the relationship between the captured image of the tracked pattern, and the original pattern. From this, one generates translation and orientation matrices for the placement of virtual content in the scene. This is Strong AR, but on a local scale and without a connection to a coordinate system linked to the world as a whole.

The other is the AR found in most mobile apps like Layar and Wikitude. The information visualized through these apps is placed using a combination of geolocation and orientation derived from the sensors found in smartphones. These sensors are the components of a MARG array: triaxial magnetometric, accelerometric, and gyroscopic sensors. By knowing a user’s position and orientation, which are together referred to as a user’s pose, one nominally knows what a user is looking at, and inserts content into the scene. The problem with this method is one of resolution and accuracy, and this is what Blaise was referring to as “weak.” This method, however, provides an easy means by which to place data out in the broader world, if not with precise registration.

The future of Strong AR is the fusion of these two paradigms, and this is what Read / Write World is being developed for. The underlying language of the system is called RML, or Reality Markup Language. Already, if photographic data for a location exists in the system, and one uploads a new image with metadata placing it nearby, the Read / Write World can return the homography matrix. According to Blaise’s statements during his Augmented Reality Event keynote, pose relative to the existing media is determined with accuracy down to the centimeter. And the new image becomes part of the database, so users will constantly be refining and updating the system’s knowledge of the world.

Anyhow, I think Read / Write World has the potential to be the foundation for everything that I, and so many others, have envisioned. That’s on the infrastructure side.

So what about the hardware?

In the last couple of years, mobile devices have really grown up, and are getting to, or have reached, the point where they pack enough processing power to be the core of a real Strong AR system. Qualcomm has positioned itself as one of the most important entities in Augmented Reality, providing an AR SDK optimized for their hardware, on which most Android and Windows Mobile platforms are based. In a surprising move, at ARE, they announced that they are bringing their AR SDK to the iOS platform as well.

With peripheral sensor support and video output, we’ve got almost everything we need to be able to connect a pair of see-through display glasses (more on those in a little bit) to one of these mobile devices for AR experience. But the best that those connections can provide is a “weak” AR experience. Why? Because the connectors don’t support external cameras. True, there are devices like the Looxcie, but the resolution and framerate are paltry, and are a limitation of the Bluetooth connection. On top of that, the integrated cameras in mobile devices are wired at a low-level to the graphics cores of their processors and dump the video feed directly into the framebuffers, facilitating the use of optimized processing methods, such as Qualcomm’s. What we need is the inclusion of digital video input in the device connectors, providing the same sort of low-level access to the video subsystems of the devices. This is absolutely vital to being able to use visual information from the camera(s) on a pair of glasses for their intended purpose of real-time pose estimation.

So… glasses…

At ARE I got to try out a Vuzix prototype that finally delivers what I’d hoped to see with the AV920 Wrap. The new device is called the STAR 1200, for See-Through Augmented Reality. It looks a little funny in the picture, but don’t worry about the frame. The optical engine is removable and the final unit’s frame will probably look substantially different. It provides stereo 852×480 displays projected into optically see-through lenses and, let me tell you, it looks good. It is a great first step towards something suitable for mass adoption. The limited field of view coverage means that it won’t provide a truly immersive experience for gaming and the like, but again, it is a great first step. Now before I get your hopes up, this device will be priced for the professional and research markets, like the Wrap 920AR. Vuzix isn’t a big enough company to bust this market open on its own. But once apps are developed and the market grows, we’ll see this technology reaching consumer-accessible price points. I’m going to refrain from predictions of timeframe this time around, but I think that things are very much on track. Also, keep in mind that this is a different technology than the Raptyr, the prototype that Vuzix showed at CES this year. The Raptyr’s displays utilize holographic waveguides, while the STAR 1200 is built around more traditional optics. I did get to see another Vuzix prototype technology in private, and can’t say anything about it, but it is very promising.

One last development that has me very excited is Google’s new Open Android Accessory Development Kit. It’s based on the Arduino platform, making it instantly accessible to hundreds of thousands, if not millions, of existing experimenters, developers, and hardware hackers, including myself. This opens up all kinds of possibilities for custom human interface devices.

Flickr

Why?

I'm writing this blog because it seems to me that few people in the media really seem to grasp what's coming. Any reviews of HMDs that I see, for instance, talk about watching movies on the train and other mundane applications. Also, Augmented Reality isn't all about holding a fiducial marker in front of your webcam to generate a composite video on your screen. AR (in my opinion) is about augmenting YOUR reality from YOUR perspective. It is about wearable and ubiquitous computing. It is about getting out from in front of your screen and putting your data out in the spatial world in which you live. This is where I'll be musing on these themes. Enjoy :-D
Augmentation by Noah Zerkin is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.Based on a work at augmentation.wordpress.com.Permissions beyond the scope of this license may be available at noazark@gmail.com.