How does augmented reality work?

So, we’re all now experts in what AR is and how it came about. Today in AR Week, we get into the nitty-gritty a little further. We’re going to take a look at how augmented reality actually works and, from that, you should get a pretty good idea of why we are where we are with AR applications and what it’s going to take to get this branch of technology up to the next level.

To make sense of the process, we’ll break it down into the necessary components which make AR possible. So, from the viewer to the world and beyond, this is how AR works.

Window on the world

The first thing you’re going to need for augmented reality is some reality. If you haven’t got that then you’ve got nothing to work with and all you could possibly create is an entirely virtual world. So, effectively, what you need is a window on the world of some sort, whether that’s a remote view through a video screen - as in the case of a television - or, more likely, the location you wish to augment. You need that backdrop, the canvas of reality, to which you add information that you otherwise would not be able to appreciate. If there’s no reality, there’s no AR. Simple enough so far.

AR display device

Now that you’ve got your background environment, you need a way of displaying the augmentations to your brain. The idea of AR, of course, is to supply information about your environment that’s otherwise undetectable to your naked senses. So, for it to work, you need a method of displaying those annotations that wouldn’t ordinarily exist. Most often, what we’re talking about is effectively a frame through which to look at the world. The classic examples are a mobile phone, a head-mounted display (glasses or a visor), a head-up display (as on a fighter pilot’s windscreen) or even a tablet in this day and age.

There are other possibilities, such as a projector to provide an overlay directly onto the surface you wish to augment, or even the most common one of all: a television or monitor, which quite neatly supplies both a view of reality via a remote camera and the power to display graphic information, depending upon what’s added by the production team in their broadcast studios or by your computer hardware.

Beyond the visual world, you could also have devices for other senses, such as a glove or an earpiece, which could supply more information to the sense of touch or hearing. Whatever the case, there needs to be something in between your organic senses and the environment to translate an otherwise undetectable stimulus into something you can detect.

Data

It’s one thing to have your reality and a way to display the desired augmentations, but a crucial piece of the puzzle is having that extra information to add in the first place. What we’re most usually talking about is a database of some sort and there are two main forms that this is likely to take.

The first is that it could be a local source of information; in other words, a database stored somewhere on the AR user. More likely in this day and age is the second option, and that’s using that great source of freely accessible data that is the Internet. We could be talking about something very specific, such as the information on Wikipedia, or it could be a more complex application pulling in data from a variety of places such as Facebook, Flickr and Twitter to, perhaps, offer face recognition based on photographs followed by whereabouts, telephone numbers, status updates and likes and dislikes - all from information shared on these social networks.

Of course, the data doesn’t have to be free if it’s on the Internet. It could be something stored in the cloud but still secured and private and only accessible to certain people. That way it’s still information that can be accessed from anywhere without having to walk around with a hard disk strapped to the user’s back. However, if you wish to wear that backpack or keep all the data on your mobile phone, then that's okay too.
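
To make the local-versus-Internet choice a little more concrete, here is a minimal sketch in Python of how an application might check the on-device store first and only then go out to a remote source. Everything named here is an assumption for illustration: `lookup_annotation`, the dictionary standing in for the local database and the `fetch_remote` callable standing in for whatever online service is used.

```python
from typing import Callable, Dict, Optional


def lookup_annotation(
    key: str,
    local_db: Dict[str, str],
    fetch_remote: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the annotation for `key`, preferring the local store.

    `local_db` stands in for data carried on the device and `fetch_remote`
    for a call to an online source. Both are hypothetical placeholders.
    """
    # 1. The local copy is the quickest and needs no connection.
    if key in local_db:
        return local_db[key]

    # 2. No remote source configured: nothing more we can do.
    if fetch_remote is None:
        return None

    # 3. Ask the remote source, treating a network failure as "no data".
    try:
        result = fetch_remote(key)
    except OSError:
        return None

    # 4. Keep a local copy so the next lookup does not need the network.
    if result is not None:
        local_db[key] = result
    return result
```

Keeping a copy of each remote answer is one possible design choice rather than a requirement: it trades a little storage on the device for fewer trips over the network.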

Connection

Whether the user physically carries the data around or it’s found on the Internet instead, the fact remains that one requires a live connection to that information for AR to work. In the case of having it all locally with a backpack strapped on, it’s simply a case of using high speed cables to wire a direct connection between the computer and your window on the world; that is, your AR display device.

In many ways, this is actually the best solution because the connection is both quick and reliable. The limitation is that it’s rather impractical on the mass user level so, heading into the future, the connection that we’re talking about is a connection - usually wireless - to the Internet. That’s either going to be over a local network to a router via Wi-Fi, if in a more controlled environment, or, more probably if AR is really going to kick off in a big way, over something further reaching and more ubiquitous in the shape of the mobile broadband network, be that HSDPA, LTE, WiMAX or whatever the best available technology is at the time.

Naturally, the downside here is that the user is relying heavily on coverage and speed of connection for a consistent AR experience as well as the servers at the other end being in good shape too. Thankfully, all of this is getting better as time goes on and, although it might seem like the weak link at the moment, it’s not, in fact, the area that’s holding AR back, but more on that later in AR Week.

Application

The hardware is in place but there’s a lot of reality out there and bags of information on the Internet for your AR display device to connect to. What you now need is some software to recognise what’s coming into your device from the outside world, call up the required information based on that and then instruct your mobile phone, your HMD or whatever it is to display and overlay the data correctly - and all in the blink of an eye. Not an easy task.

For most of the top labs studying AR in the technology institutions of the world, this is exactly what they’re working on. Potentially this application could be stored on the Internet, on your display device or a dedicated box somewhere on your person. Generally speaking it’s going to require some decent graphics processing hardware to work properly for the initial recognition part of the process as well as having the ability to generate the augmentations for the user to see. What’s more, all of this has to be done in real time for it to work.
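
The recognise, look up and overlay steps, plus the real-time requirement, can be sketched roughly as follows. This is only an outline under stated assumptions: the `recognise`, `lookup` and `render` callables are hypothetical stand-ins for the hard parts, and the budget of 30 frames per second is a figure we have picked for illustration, not one taken from any particular AR system.

```python
import time

# Assumed target: 30 frames per second, so roughly 33 ms per frame.
FRAME_BUDGET_S = 1.0 / 30.0


def process_frame(frame, recognise, lookup, render, clock=time.perf_counter):
    """Run one pass of recognise -> look up -> overlay for a single frame.

    `recognise(frame)` returns a list of labels for what is in view,
    `lookup(label)` returns the extra information (or None) and
    `render(frame, annotations)` produces the augmented output.
    All three are hypothetical callables supplied by the caller.
    Returns the output and whether the frame met the time budget.
    """
    start = clock()

    labels = recognise(frame)

    # Drop anything the data source knows nothing about.
    annotations = []
    for label in labels:
        info = lookup(label)
        if info is not None:
            annotations.append(info)

    output = render(frame, annotations)

    elapsed = clock() - start
    return output, elapsed <= FRAME_BUDGET_S
```

If a frame regularly misses the budget, the overlay will lag behind the view, which is one way of seeing why the processing hardware matters so much.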

One way around the recognition part of the equation is to use GPS to track the user’s position instead of relying on the software to correctly identify the surroundings from the view through the lens of a cameraphone or whatever it may be. The issue there, though, is that you need the positional information to be dead on, as well as a data set that includes maps, although the latter isn’t such a problem with the likes of Google Maps and Street View available to all.
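
As a rough illustration of the GPS approach, the sketch below works out how far away a point of interest is and in which compass direction it lies, then checks whether it falls within the camera’s view. It uses the standard haversine and initial-bearing formulas. The function names, the 60 degree field of view and the idea of taking the heading from a compass are our own assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (metres) and initial bearing (degrees
    clockwise from north) from point 1 to point 2, given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)

    # Haversine formula for the distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing, normalised to the range 0..360.
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, bearing


def in_view(bearing_deg, heading_deg, fov_deg=60.0):
    """True if the bearing lies within the camera's horizontal field of
    view, centred on the device heading (e.g. from a compass)."""
    # Signed smallest angle between bearing and heading, in -180..180.
    diff = (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0
```

The geometry also shows why the position needs to be dead on: an error of a few metres barely changes the bearing to a landmark a kilometre away, but it can swing the bearing to a shop front across the street by a large angle.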

Perhaps one of the hardest tasks of all is to track the virtual objects and render them correctly in 3D so that the user can move through his or her environment while still receiving accurate annotations of that which they see. To do all that without so much as having the overlay a millimetre off, in real time and with the correct perspective, is one of the toughest challenges of all.
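
To give a feel for the perspective part of the problem, here is a minimal sketch of the textbook pinhole camera projection, which maps a 3D point in front of the camera to a pixel position on the screen. It assumes the point is already expressed in the camera’s own coordinates (x to the right, y down, z forward) and that the focal length is known in pixels. Working out those camera coordinates as the user moves is the tracking problem itself and is not attempted here.

```python
from typing import Optional, Tuple


def project_point(
    point_cam: Tuple[float, float, float],
    focal_px: float,
    width: int,
    height: int,
) -> Optional[Tuple[float, float]]:
    """Project a 3D point (camera coordinates, metres) to pixel coordinates.

    Assumes an ideal pinhole camera with the principal point at the
    centre of the image and no lens distortion. Returns None if the
    point is behind the camera or lands outside the image.
    """
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera, nothing to draw

    # Perspective divide: the further away, the closer to the centre.
    u = focal_px * x / z + width / 2.0
    v = focal_px * y / z + height / 2.0

    if not (0 <= u < width and 0 <= v < height):
        return None  # outside the visible frame
    return (u, v)


# A point 1 m to the right and 4 m ahead, on a 640x480 image
# with an assumed focal length of 800 pixels.
print(project_point((1.0, 0.0, 4.0), 800.0, 640, 480))  # (520.0, 240.0)
```

Because of the division by z, a small error in the estimated position of a nearby object moves the overlay by many more pixels than the same error for a distant one.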

Finally, on a more conceptual than technical level, the developer also needs to make sure that any information displayed to the user is presented in a meaningful way that’s relevant to the task. We can only take in so much at once and it’s no good bombarding someone with more than their brain can process. So, while it might be possible for an application to pull in 3D details of absolutely everything in a scene, it’s important to have something that’s both selective and unobtrusive as well. This is augmented reality and both the augmentations and the reality are as important as one another.
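
One simple way to be selective is to cap the number of annotations on screen and show only the most relevant ones nearby. The sketch below is an assumption-laden example rather than a recommended design: the `Annotation` type, the relevance score between 0 and 1, the limit of three items and the 200 metre cut-off are all invented for illustration.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    label: str
    relevance: float   # hypothetical score, 0 (ignore) to 1 (essential)
    distance_m: float  # distance from the user in metres


def select_annotations(
    candidates: List[Annotation],
    max_items: int = 3,
    max_distance_m: float = 200.0,
) -> List[Annotation]:
    """Pick a small, uncluttered set of annotations to display.

    Drops anything beyond `max_distance_m`, then keeps the `max_items`
    most relevant, breaking ties in favour of whatever is closer.
    """
    nearby = [a for a in candidates if a.distance_m <= max_distance_m]
    nearby.sort(key=lambda a: (-a.relevance, a.distance_m))
    return nearby[:max_items]
```

How the relevance score is produced depends entirely on the task the user is carrying out, which is exactly the conceptual problem described above.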

Conclusion

So, those are the elements that make AR possible and, if you manage to complete that chain from one end to the other, then you’ll have a system that works and is hopefully useful at the same time. From our brief look at each step so far, it’s very easy to see that none of this is straightforward, and we’ll take a closer look at where most of the challenges lie later in AR Week. Stay tuned.