The Reality of Augmented Reality

In 2009, augmented reality (AR) technology became mainstream. Though it has been under development for over four decades, in the past year it was prominently featured in major ad campaigns and appeared on the cover of Esquire. Concurrently, Layar, Wikitude, and a number of other AR applications were released for mobile phones. The future potential of AR has now captured the imagination of both the public and the press. The hype surrounding this technology is similar to the excitement over virtual reality during the 1990s and 3D online communities, namely Second Life, during the past decade. Unfortunately, in the minds of consumers, neither of those technologies lived up to the hype. Due to a lack of understanding, virtual reality and 3D online communities were unfairly and prematurely dismissed as failures by many. AR is in danger of suffering the same fate. Geoff Northcott described the situation well in his post Augmented Reality, Second Life, and the Trough of Disillusionment.

In an effort to help manage expectations regarding AR technology, I will briefly describe what works today while clarifying what we can expect in the future.

Augmented Reality Glasses are not Viable in the Near Term

When covering AR, a number of technology pundits have assumed that within the next few years, head mounted displays, or augmented reality "glasses," will become the best display for AR applications. Without a doubt, it would be groundbreaking if a high quality AR display could be built into the form factor of sunglasses. Unfortunately, a lightweight, wide field of view, daylight readable head mounted display (HMD) at mass market prices is not something we can expect to see in the next five years. I have either bought or used most head mounted displays sold commercially since 1992, and I have seen great strides over the years in HMD resolution, brightness, and power usage. To illustrate where the technology now stands, here are the best that I have used.

-Fakespace Labs "Wide 5": Easily the most immersive head mounted display I have tried, and it can be modified for video see through augmented reality applications.

-eMagin Z800 3Dvisor: It has a relatively small form factor, is very affordable, uses OLEDs, and can be adapted for video see through augmented reality applications. The field of view is limited.

No one has yet delivered a wide field of view display in a small package that approaches the footprint of sunglasses as the public expects. With significant engineering investments, innovative approaches from companies such as Lumus, ORA, and Digilens may show promise in solving this problem. In addition to optics issues, eye strain and other head mounted display ergonomics problems must be dealt with. For a detailed overview of the current and future state of head mounted display technology, I suggest the following:

-"The Coming Generation of Head-worn Displays (HWDs): Will the Future Come to Us through New Eyes?" by Jannick Rolland, Kevin Thompson, James P. McGuire, and Ozan Cakmakci, Optical Society of America 2009 Annual Meeting, San Jose, CA. Download PDF

-"The Past, Present, and Future of Head Mounted Display Designs" by Jannick Rolland and Ozan Cakmakci (College of Optics and Photonics: CREOL & FPCE, University of Central Florida). Download PDF

There will be no single "ultimate" AR display. Instead, the type of display used for AR will be determined by the needs of the application. Current and future developers will have a range of options including smart phone screens, handheld tablets, desktop monitors, micro projectors, and AR glasses.

GPS and Compass are not Enough

There are a number of mobile augmented reality applications, namely Layar, Wikitude, and the Yelp Monocle, which use GPS and compass data to overlay graphic information on a live video view of the real world. Though these applications are novel and interesting, the data provided by a mobile device's GPS and compass is simply not precise enough to deliver a quality AR user experience. Information overlays usually appear to wobble or bounce around the video view. Consumers will rarely use these applications after the novelty factor wears off. At best, these applications provide an alternative viewing mode for data that should first be presented on a 2D map or in a list. Here's a well-written piece from New Scientist commenting on the current state of mobile augmented reality applications: Augmented Reality Gets off to a Wobbly Start
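To see why the overlays wobble, consider how such an application places a label: it converts the two GPS positions into a compass bearing, compares that bearing with the device's heading, and maps the angular offset onto the screen. The sketch below (Python; the field of view, screen width, and distances are hypothetical values, not taken from any particular application) also shows how a modest 5 meter GPS error on a nearby point of interest becomes an angular error of nearly 10 degrees, which is why labels jump around on screen.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from the user to a point of interest, in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0

def overlay_x(bearing, heading, fov_deg=60.0, screen_w=480):
    """Horizontal pixel position of a label, given the compass heading and camera field of view."""
    offset = (bearing - heading + 180.0) % 360.0 - 180.0  # signed angle from screen center
    return screen_w / 2 + (offset / fov_deg) * screen_w

# A point of interest due north of the user, with the user facing north,
# lands in the center of a 480-pixel-wide view:
x = overlay_x(bearing_deg(47.60, -122.33, 47.61, -122.33), heading=0.0)

# A 5 m GPS error on a point only 30 m away shifts its label by ~9.5 degrees:
err_deg = math.degrees(math.atan2(5.0, 30.0))
```

Since consumer GPS error is routinely several meters and consumer compasses are noisy to begin with, the overlay error for nearby points is large and constantly changing, which matches the wobble users see.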

Sensor Fusion is the Answer

The best approach to AR tracking and registration involves hybrid tracking and sensor fusion techniques which use computer vision technology in conjunction with GPS and compass data. This paper provides deeper insight.

Mobile augmented reality applications using a variety of sensor fusion techniques have been prototyped over the years, and we will hopefully see applications leveraging them on the market soon.
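As a toy illustration of the idea, a complementary filter is one of the simplest sensor fusion techniques: it trusts a rate sensor (such as a gyroscope) for smooth short-term motion while continuously nudging the estimate toward an absolute but noisy reference (the compass). This sketch is illustrative only and is not how any particular shipping application works; the noise values and blend factor are made up.

```python
def complementary_filter(heading, gyro_rate, compass_heading, dt, alpha=0.98):
    """One fusion step: integrate the gyro (smooth, but drifts over time),
    then pull the estimate a small amount toward the compass (noisy, drift-free)."""
    predicted = heading + gyro_rate * dt                              # short-term: gyro
    error = (compass_heading - predicted + 180.0) % 360.0 - 180.0     # shortest signed angle
    return (predicted + (1.0 - alpha) * error) % 360.0                # long-term: compass

# Stationary user facing 90 degrees; the raw compass jitters +/-8 degrees every frame.
estimate = 0.0
for i in range(300):
    noisy_compass = 90.0 + (8.0 if i % 2 == 0 else -8.0)
    estimate = complementary_filter(estimate, gyro_rate=0.0,
                                    compass_heading=noisy_compass, dt=1 / 30)
```

After a few seconds of frames the estimate settles near the true heading with only a fraction of a degree of residual jitter, even though the raw compass swings through 16 degrees. Real hybrid trackers add vision-based corrections on top of this kind of inertial smoothing.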

AR Hardware Platforms for Today

Here are several AR hardware platforms that developers can utilize today for creating augmented reality applications.

-Webcams

In the past year, there has been an explosion of ad campaigns which deliver webcam based AR experiences through an internet site or a downloadable application. Most of these utilized the FLARToolkit. I’ve used it to develop a number of AR web demos, available via the links below.
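FLARToolkit itself is an ActionScript library, but the geometric core of any such marker tracker is the same: once the four corners of a square marker are found in the camera image, the tracker estimates the homography from marker space to image space so that content can be drawn locked to the marker. Here is a minimal, language-neutral sketch of that registration step in Python with NumPy; the corner coordinates are made-up example values, and a real tracker would of course detect them from the video frame.

```python
import numpy as np

def homography(src, dst):
    """Direct linear transform: the 3x3 homography mapping src points to dst
    (exact for four correspondences in general position)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(H, pt):
    """Map a marker-space point into image coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Unit-square marker seen as a skewed quadrilateral in the camera image:
marker = [(0, 0), (1, 0), (1, 1), (0, 1)]
seen = [(120, 80), (310, 95), (300, 270), (110, 250)]
H = homography(marker, seen)
cx, cy = project(H, (0.5, 0.5))  # where to anchor content at the marker's center
```

Full 3D pop-out effects additionally decompose the homography into a camera pose using the known camera intrinsics, but the corner-to-homography step above is what keeps the graphics pinned to the marker as it moves.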

-3G Smartphones

3G smartphones have enormous potential for AR applications. Phones from manufacturers including Apple, HTC, and Nokia are equipped with a camera, GPS, compass, accelerometers, and 3D graphics capabilities. Unfortunately, developers are currently more limited by the restrictions of manufacturer APIs than by the hardware itself. For example, developers cannot currently release iPhone applications which directly perform vision based tracking on the camera's video stream. Computer vision based AR applications currently in the App Store rely on analyzing still frames, which limits the AR experience. Hopefully, in 2010 these frustrating API barriers will be overcome, and vision based AR applications will proliferate on mobile devices.

-Game Consoles

Microsoft’s heavily publicized Project Natal should provide a set of tools for authoring AR experiences for the Xbox 360. Hopefully, developers will embrace AR and ingeniously incorporate it into their game designs.

-Tablets

Researchers have extensively used tablets to prototype AR applications, proving the viability of the tablet as an AR platform. Most of these demos were created using the last generation of tablets from HP and Toshiba. Though these devices provided great functionality, they were not widely accepted by consumers. Here’s a 2004 AR paper from Georg Klein et al. which utilized the HP TC1000 tablet. With its sleek design and form factor, Apple’s forthcoming tablet may have greater success. Apple’s tablet will reportedly be equipped with a camera, which could make it another excellent AR platform. Porting an iPhone application to the Apple tablet is expected to be straightforward, which could encourage the release of AR applications with versions for both devices.

Focus on Utility and Fun, not Novelty

After the “wow factor” wears off, there’s a risk that consumers will begin to dismiss AR as a gimmick rather than a technology that provides real value. Developers can prevent this by designing applications that focus on utility and fun rather than on the novelty of seeing an object pop out of a marker.

Both of these applications provide unique value, helping consumers understand that AR is useful beyond being an attention grabbing feature.

The Road Ahead

For those who have been a part of the AR community since its infancy and those who are just entering the field, these are very exciting times. After decades of research, mass market AR applications are finally viable and can be delivered on a variety of platforms. If developers, investors, analysts, and consumers can develop a real understanding of what AR can and cannot do, the future of AR technology is bright, and I look forward to its evolution as an innovative and inspiring medium for creativity, communication, and commerce.

7 Comments

Have you seen Microvision’s Eyewear? (http://www.microvision.com/wearable_displays/mobile.html) It seems promising. See-through, visible during daytime, 3D, etc. Apparently the military is trying it out. I don’t know how it would project dark areas, and I don’t know how the lasers would continue to shine on your retina if you shifted your eyes away from center, but anyway, it seems very interesting.

I am very familiar with Microvision. They have been around a long time, having spun out of work at the University of Washington HIT Lab in the 1990s. Twelve years ago I worked for one of their board members, and I co-authored a white paper on applications for a fully realized, color, wide field of view Microvision retinal display. Since that time, I’ve used various red monochrome displays they’ve either prototyped or sold. About eight years ago, they released and sold the “Nomad” display. It was not widely adopted, and I do not think it is available today. When I used one, it delivered an excellent sense of imagery being overlaid on the real world, though the field of view was relatively small. I do recall feeling a slight tingling sensation in my eye.

The military first looked at their display technology in the late 1990s. Daylight readability was the key selling factor. Now, there are other technologies which are daylight readable. Microvision seems to have shifted their focus to micro video projectors. I get the sense that they really need a successful product on the market that takes advantage of their intellectual property.

The biggest challenge in head mounted display (HMD) design is creating a wide field of view in a compact physical package. No one has solved this problem yet, though there are ideas that hold promise and could advance the technology given funding and time. There’s also the eyestrain issue, which may never be fully resolved. On the other hand, stereoscopic 3D movies can also cause eyestrain discomfort, yet they are now a major commercial success decades after they were first introduced. Head mounted display development is on a slow path, but I am always hopeful for research, technical, and market breakthroughs.

I noticed that Vuzix did not warrant a mention in your post. I am looking at the possible application of their AR 920 product in the field of architectural visualization. I did not get to see their product demo at CES, so I have no idea what the world looks like through their glasses, but it’s the only commercial product slated for release in the near future.

In my firm, we create BIM models using Autodesk Revit Architecture. Recently I started geolocating these models and displaying them in Google Earth. (Ok so I’m late to the game. Big deal, people still think that’s cool.)
Logically, the next step is to be able to display a geolocated model in the field. Granted, once the user is in the field, the problem is tracking the user. GPS and compass are out, so what about photogrammetry? I was wondering if there is the possibility of solving this through software rather than hardware.

Different existing commercial software packages allow the user to interact with imagery and 3D in ways that could (to me) be combined to allow the accurate positioning of a 3D building in an augmented reality setting.

Autodesk Imagemodeler allows the user to “pin” known points of a 3D model to known points in a photo, for easy camera matching. The newly released VideoTrace Beta allows the user to “trace” a 3D model from one or more calibrated video segments. And of course existing camera matching software such as Autodesk Matchmover allows the user to extract a camera path from a video segment.

Well-used photogrammetry software like EOS PhotoModeler uses coded targets for fine calibration of images for 3D extraction.

Meanwhile LinceoVR and AR-Media use coded targets for AR display.

So riddle me this: why can’t some brilliant software engineer combine these elements in the following way?

1. Head out to the project site with a geolocated survey.

2. Locate known coordinates in the field with coded targets.

3. Put on AR headset.

4. Walk a calibration route which allows the user to view a given number of targets at a time while recording the output of the AR glasses.

5. Export to calibration software which uses elements of camera matching, photogrammetry, and a few parameters such as known distances between physical points to situate the user within the site.

6. Once this is done, slap the 3D model “onto” the site and “pin” the known points in the 3D site to the known points in the real world.

7. Keep updating the position of the user via the video feed from the headset, using the coded targets, image recognition of the site (cruise missile style), and existing x,y,z head tracking technology.

If the problem is weighing down the headset with active systems, take those systems out of the headset and put them into the world. The headset could be connected to a teeny tiny laptop carried by the user (MacBook Air or similar). The coded targets can be readily printed to whatever size is needed with any plotter and the AEC industry can have a reaaaaaaallllly cool new way of showing the client what they’re getting. Also a reaaaaaaallllly cool way of coordinating disciplines.

Granted it’s not a solution for people who want to walk around the street with it, but for my use where I’m displaying a building on a known, limited physical site it would be priceless.

I want to be able to see the building, walk around the building and walk into the building. I wholeheartedly believe that a software solution can get us there with existing technology.

Thank you for your response. I’ve tried out Vuzix AR products in the past. They are functional, but give the impression that you are viewing the world through a television set. For some niche applications, they may be adequate.

The system you described is quite feasible. The reason it can be done is that you are doing it in a “prepared” location where markers have been placed and calibration data has been gathered. In fact, there have been a number of research demos over the years that accomplish what you are trying to do. The visual effects industry uses roughly similar techniques to allow directors and actors to see computer generated objects and effects while they are shooting on set.
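Steps 5 and 6 of your pipeline, situating the viewer from surveyed points, amount to classic camera resectioning. As a rough sketch only (pure NumPy, with the point data synthesized for illustration; a production system would use a calibrated solver such as OpenCV's solvePnP), the direct linear transform recovers the full camera projection matrix from six or more known 3D-to-2D correspondences, and the viewer's position then falls out as that matrix's null vector:

```python
import numpy as np

def camera_matrix(points3d, points2d):
    """DLT resectioning: 3x4 projection matrix from >= 6 known 3D-2D point pairs
    (the 3D points must not all lie on one plane)."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points3d, points2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 4)

def camera_center(P):
    """The camera's position in site coordinates is the right null vector of P."""
    _, _, vt = np.linalg.svd(P)
    c = vt[-1]
    return c[:3] / c[3]
```

Here points3d would be your surveyed coded-target coordinates and points2d their detected positions in a headset video frame; rerunning the solve every frame (or feeding it into a tracker) keeps the model pinned to the site as the user walks.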

As an alternative to a head mounted display, you could also consider using augmented reality binoculars or an LCD panel mounted on a tripod. With these displays, you would not need to track position after initial calibration; you would only need to track orientation. Using an LCD with a camera mounted on the front would allow a group of clients to view the scene at once.

I agree that it will continue to be hard to meet the expectations for AR as long as the majority of applications are still quite gimmicky. I think the future of AR is in more devices. The more types of devices that can implement AR functionality, the more practical it will be for necessary, everyday activities.