
Evolving the Human Machine Interface Part III

The concept of Presence in Virtual Reality (VR) has been gaining popularity over the past year, particularly within the gaming community. With consumer VR devices in development from Oculus, Sony, and more than likely Microsoft, Presence has become the metric by which we evaluate all VR experiences. But Presence is difficult to describe to someone who has never tried VR. “It’s like I was actually there. It made me feel like what I was seeing was actually happening to me, as though I was experiencing it for real,” is how one colleague described the experience.

Presence in VR triggers the same physical and emotional responses one would normally associate with a real-world situation; it is the wonderfully magical experience of VR. But how is Presence achieved? While many research studies have provided a variety of subjective descriptions of Presence, there seem to be three common variables that most affect tele- or virtual Presence:

1. Sensory input: at minimum the ability to display spatial awareness

2. Control: the ability to modify one’s view and interact with the environment
3. Cognition: the individual's own ability to process and react to the experience

Because the nature of VR isolates the user from real-world visual input, if the device's sensory input and control are inadequate or missing, the effect of Presence fails, often with ill side effects: you feel sick to your stomach!

Sensory Input

For those who have tried VR, at some point or another you've felt queasy. That point at which the experience turns from wonder to whoa has been an unfortunate side effect throughout the development of VR. As Michael Abrash presented at Valve's Steam Dev Days, the hurdles to overcoming VR sickness and achieving Presence are within reach. In the video below, Michael expertly summarizes the technical hurdles to achieving a believable sense of Presence in VR.

“What VR Could, Should, and Almost Certainly Will Be within Two Years” Steam Dev Days 2014

Michael Abrash, Valve Software

Control

To achieve a minimum level of Presence, head tracking is used to match the displayed VR image to the user's own head position and orientation. While this helps create a sense of spatial awareness within VR, it still doesn't make someone's experience truly "Present." To do that, we need to add a representation of ourselves, either through an avatar or our own body image. Viewing our physical presence in VR, known as body awareness, creates an instant sense of scale and helps ground the user within the experience. In the video sample below, Untold Games creates body awareness through avatar control.

“Loading Human, true body awareness” Untold Games 2014

Without body awareness, VR can feel more like an out-of-body experience. Everything "looks" real and the user has spatial awareness, but the user's body and movements are not reflected, so the user does not feel actually present. Combining body awareness with VR's spatial awareness creates a strong bond between the user and the experience.
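At its core, the head-tracking step described above is just re-rendering the scene each frame from the tracked head pose. A minimal sketch of that idea, in plain Python with hypothetical function names and a simplified pose (position plus yaw only; a real HMD supplies a full orientation quaternion):

```python
import math

def view_matrix(head_pos, yaw_rad):
    """Build a 4x4 row-major view matrix from a tracked head position
    and yaw (rotation about the vertical axis). The view transform is
    the inverse of the head's world transform: inverse rotation
    (transpose) followed by inverse translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    x, y, z = head_pos
    return [
        [c,   0.0, -s,  -(c * x - s * z)],
        [0.0, 1.0, 0.0, -y],
        [s,   0.0, c,   -(s * x + c * z)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point (w assumed 1)."""
    return tuple(sum(m[r][i] * (p + (1.0,))[i] for i in range(4))
                 for r in range(3))

# With no head rotation, a point one metre in front of the head
# stays one metre in front in eye space.
m = view_matrix((0.0, 1.7, 0.0), 0.0)
print(transform(m, (0.0, 1.7, -1.0)))  # -> (0.0, 0.0, -1.0)
```

Each frame the tracker feeds an updated `head_pos` and orientation into this transform, which is what keeps the rendered view locked to the user's actual head movement.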

Cognition

The third parameter of Presence is us. The feeling of Presence in VR is directly influenced by our personal ability to process and react to environmental changes in the real world. It's likely that many of us will not have the same reactions to experiences within VR. If you easily get carsick, then VR motion will give you the same sensation. If you're afraid of heights, fire, spiders, etc., you'll have the same strong reactions and feelings in VR. Our individual real-life experience influences our perception of, and reactions to, VR. This can lead to some interesting situations, particularly in gaming: one player may be relatively unaffected by a situation or challenge, while another may be strongly affected.

Obviously the conditions of Presence are perceptual only. In most cases we’re not at the same physical risk in virtual environments as we would be in real life. But our own cognition coupled with VR’s ability to create Presence is why VR is such a popular field for everything from gaming and entertainment to therapy and rehabilitation.

Once we start to overcome these technical hurdles and provide a basic level of Presence, we next need to understand what it will ultimately enable. What does Presence provide for us in an experience other than merely perceiving the experience as real-like? We’ll explore that idea in the next segment, and try to understand where Presence will have the most impact.

How the World Is Finally Ready For Virtual and Augmented Reality

By Mike Nichols, VP, Content and Applications at SoftKinetic

The year is 1979, and Richard Bolt, a researcher at MIT, demonstrates a program that enables control of a graphical interface by combining speech and gesture recognition. As the video below demonstrates, Bolt points at a projected screen image and issues verbal commands like "put that there" to control the placement of images within a graphical interface, in what he calls a "natural user modality".

“Put-That-There”: Voice and Gesture at the Graphics Interface

Richard A. Bolt, Architecture Machine Group

Massachusetts Institute of Technology – under contract with the Cybernetics Technology Division of the Defense Advanced Research Projects Agency, 1979.

What Bolt demonstrated in 1979 was the first natural user interface. A simple pointing gesture combined with a verbal command, while an innate task in human communication, was and still is difficult for machines to understand correctly. It would take another 30 years for a consumer product to appear that might just fulfill that vision.

A new direction

In the years following Bolt's research, technology advanced to offer another way to improve the Human Machine Interface (HMI). By the mid-'80s the mouse, a pointing device for 2D screen navigation, had evolved into an accurate, cost-effective, and convenient method for navigating a graphical interface. Popularized by Apple's Lisa and Macintosh computers, and supported by the largest software developer, Microsoft, the mouse would become the primary input for computer navigation over the next 20 years.

“The Macintosh uses an experimental pointing device called a ‘mouse’. There is no evidence that people want to use these things.”

San Francisco Examiner, John C. Dvorak – image provided by…

In 2007, technology advancements helped Apple once again popularize an equally controversial device, the iPhone. With its touch sensitive screen and gesture recognition, the touch interface in all its forms has now become the dominant form of HMI.

The rebirth of natural gesture

Although seemingly dormant throughout the '80s and '90s, research continued to refine a variety of methods for depth and gesture recognition. In 2003 Sony released the EyeToy for use with the PlayStation 2. The EyeToy enabled Augmented Reality (AR) experiences and could track simple body motions. Then in 2006 Nintendo released a new console, the Wii, which used infrared in combination with handheld controllers to detect hand motions for video games. The Wii controllers, with their improved precision over Sony's EyeToy, proved wildly successful and set the stage for the next evolution in natural gesture.

In 2009 Microsoft announced the Kinect for Xbox 360, which could read human motions to control games and the media user interface (UI) without the aid of physical controllers.

What Richard Bolt had demonstrated some 30+ years prior was finally within grasp. Since the premiere of Kinect we've seen more progress in the development of computer vision and recognition technologies than in the previous 35 years combined. Products like the Asus Xtion, Creative Senz3D, and Leap Motion have inspired an energetic global community of developers to create countless experiences across a broad spectrum of use cases.

The future’s so bright

To this day, Richard's research speaks to the core of what natural gesture technology aims to achieve: that "natural user modality". While advances in HMI have continued to iterate and improve over time, the medium for our visual interaction has remained relatively intact: the screen. Navigation of our modern UI has been forced to work within the limits of the 2D screen. With the emergence of AR and VR, our traditional forms of HMI do not provide the same accessible input as the mouse and touch interfaces of the past. Our HMI must evolve to let users interact with the scene, not the screen.

Next, we’ll explore how sensors, not controllers, will provide the “natural user modality” that will propel AR and VR to become more pervasive than mobile is today. The answer, it seems, may be right in front of us…we just need to reach out and grab it.

Folks, here is my talk from ISTAS 2013 in Toronto on 29 July 2013. Following an intro to augmented reality, I review a collection of AR experiences and test them against Lex Ardez's "3 laws of augmented reality design".

1. Augmentation must emerge from the real world and/or relate to it

2. Augmentation must not distract from reality, but make you more aware of it
3. Augmentation must offer an experience that cannot be achieved through other means

These rules are very logical and simple, and yet most AR implementations fail to meet them. When it comes to defining an AR experience, the third law is the most important: do not implement AR only for a cool factor. If a traditional interaction technique (on computers, mobile devices, etc.) does a good job, do not try to recreate it with AR. Look for the specific experiences that can only be achieved with AR, even if they're very niche.

The talk is based on my experience over the last 6 years building AR applications and reviewing practically every AR app published in that time frame. I have seen many applications with a wow factor that lasts for 2 minutes – but most applications are not used more than once. Designers and producers need to look at AR in a very different way than traditional user experiences. It's important to understand that AR is about digitizing our interaction with the physical world. It should not be viewed as a traditional form of Human Machine Interaction (HMI), but rather as Human-World Interaction, which requires new thinking, new rules, and new experiences.

I believe that in the next few years we'll see AR become an integral part of every aspect of our work and life. It will completely change the way we interact with people, places, and things. Of course traditional approaches (PCs, mobile touch) will still be best for certain things – and AR shouldn't be forced onto things it's not intended for – but it'll create new categories of things we can't even imagine. AR has the power to enable us to do and feel things we couldn't otherwise. It can help us learn and master skills instantly. AR technology has reached a "good enough" level; it is up to designers to bring it to the masses in a meaningful way.

When using the iPhone (or a similar mobile device) for an augmented reality experience, the interaction is pretty straightforward: hold your hands up with your iPhone pointing at your target. Want more options? Touch the screen. Had enough? Tuck it back in your pocket.

How do you interact with augmented reality (AR) when it’s constantly in your field of view – overlaid on your glasses?

Interacting with augmented reality

Are we going to operate knobs on the glasses?

Pete touched a stud on his spex, pulled down a glowing menu, and adjusted his visual take on the outside world. (Taklamakan, a short story by Bruce Sterling)

-probably not beyond pressing the “on” button…

Are we going to be surrounded by rings?


-Ringo looks cool, but we're looking for a new metaphor. The traditional keyboard (albeit arched and projected on the ground) might not be the most intuitive way.

Tinmith?

-Visionary, but touching thumbs instead of using a mouse? (oh, and can I lose the backpack?)

Eye gaze tracking

– that’s pretty good for point and click. But what about more complex gestures?

(by the way, this could be great for Tennis)

Interactive clothing?

-Absolutely. This will probably be available to the public as an intuitive interaction with AR displays in 5-10 years

So is there anything that could be used for intuitive interaction with augmented reality today?

Are there any contemporary options?

Essential Reality Glove Controller (P5)

The P5 was an inexpensive, good-looking, glove-like controller that tracked finger movement – so why did it flop?

Probably because of accuracy (or the lack thereof), and the fact that it required an external reference (an IR base, similar to the Wii's). Others may contend it never found a really good use. You can still try it for yourself for under $75!

Accelaglove

The Accelaglove has the right price (<$500) and the technology is promising – but it currently focuses on translating the hand movements of sign language.

Peregrine Power Glove

The Peregrine Power Glove was a huge promise at E3 2009. It was also my biggest disappointment: using your thumb to touch your fingers to feed the computer various commands… on a good day it could replace the keyboard when playing a real-time strategy game.

There are a bunch of other gloves that may be good at certain tasks – but they are not suited for intuitive, affordable AR.

Introducing the Zerkin Glove

It's a low-cost, motion- and position-capturing data glove for 3D interaction with virtual objects in augmented reality (AR) environments.

Watch the latest iteration of the prototype in this video.

It won't replace the computer and mouse as a 3D designer's tool anytime soon, but for scenarios where there is no access to a mouse or PC it could offer truly intuitive interaction – at an affordable price. One telling example: an architect and client on location, discussing an interior design plan. This scenario is about conveying impressions and enabling rough "what if" changes – which do not require high accuracy. There are probably other interfaces better suited for VR. But when it comes to AR – this is as good as it gets.

The dust from GDC 2009 settled a while ago, and I finally got to reflect on the AR experience at the show. Guess which headline summarizes it best:

a) augmented reality was the talk of the show

b) the expo floor was swarming with AR demos

c) AR games snatched lucrative game awards

d) none of the above

A friend in San Francisco wearing retro AR goggles

Unfortunately (d) is the right answer.

But – and it’s a big ‘but’ – the ignition spark was there. The seed was planted. The first shot of the revolution was fired.

(OK, maybe the last metaphor went too far.)

Here are 5 triggers I identified that ignited the spark:

1) Blair

The first GDC augmented reality talk – ever: Blair MacIntyre covered the latest and greatest about mobile augmented reality in front of a packed room of game developers. Awesome.

2) Demos

Was it the first time AR demos (and more demos) were presented at a major Game Developers Conference?

Not sure – but it certainly was my first…

3) Mentions in talks

Was Blair’s AR talk an isolated case?

Perhaps as a main topic it was. However, for the first time, I heard significant mentions of AR in multiple other talks. Check them out:

A talk about pervasive gaming. I liked the title: "Beyond the Screen: Principles of Pervasive Games" by the folks from Pervasive Games. They played with the concept in which the whole world is the playground. These games are founded in the belief that doing things for real is pleasurable. Games that harness reality as a source book have interesting dynamics: everything in reality matters to the game, gameplay emerges from coincidence, and the real and the artificial blur.

"Camera Based Gaming: The Next Generation" by Diarmid Campbell attracted the attention of a room packed with game developers. He talked about Sony's upcoming camera games for the PlayStation 3, such as EyePet. Armed with the camera, these games will have the power to extract amusing gestures from players. Not quite AR – but it sure smells like it.

Stretching beyond entertainment – AR made a surprise appearance in a high-profile panel discussion featuring some of the gods of the gaming industry: (from right) Ed Fries, Lorne Lanning, Bing Gordon, Will Wright, and Peter Molyneux.
The best quote award went to Ed Fries for saying: "We need to take game mechanics and apply them to the real world."

4) Meet ups

Dinners, lunches, business meetups, brainstorming sessions – haven’t had that many meetings with AR enthusiasts since ISMAR 2008…

Take the example of AR advertising for the Wellington Zoo by Saatchi & Saatchi (2007).

This is a pretty complex approach, which requires publishing printed material, creating a database for the additional AR info, and querying the database before presenting the content.

In-Place Augmented Reality is a vision-based method for extracting content that is entirely encapsulated in the image itself.

The process includes using a visual language to encode the content in the image; the visualization is then done as in a normal AR application.

The secret sauce of this method is the visual language used to encode the AR information.

There are multiple benefits to this approach: the content is human-readable, it avoids the need for an AR database and any user maintenance of the system, and it works with no network communication.

A disadvantage is that there is a limit to the amount of info that can be encoded in an image. Nate describes this as a trade-off.

I also ask myself, as a distributor of AR applications: what if I want to change the AR data on the fly? Nate suggests that in such a case a hybrid approach could be used: some of the info is extracted from the encoded image, while additional image coding could point to dynamic material from the network (e.g. updated weather or episodic content).

~~~

The second presenter is Kohei Tanaka, who unveils "An Information Layout Method for an Optical See-through Head Mounted Display Focusing on the Viewability".

The idea in short is to place virtual information on the AR screen in a way that always maintains a viewable contrast.

The amusing example demonstrates a case where this approach can help dramatically: you are having tea with a friend, wearing your favorite see-through AR HMD. An alert generated by the AR system tries to warn you about a train you need to catch, but because the bright alert sits on top of a bright background, you miss the alert – and as a consequence miss the train…

Kohei's approach makes sure that the alert is displayed in a part of the image where the contrast is good enough to make you aware of it. Next time, you will not miss the train…
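The core of this idea can be illustrated with a toy sketch (hypothetical data and function names, not Kohei's actual method): given the average background luminance behind each candidate screen region, place the label in the region where its contrast against the background is highest.

```python
def pick_region(label_luma, region_lumas):
    """Return the index of the candidate screen region whose background
    luminance differs most from the label's luminance, i.e. where an
    optical see-through display would render the label most readably."""
    contrasts = [abs(label_luma - bg) for bg in region_lumas]
    return max(range(len(contrasts)), key=contrasts.__getitem__)

# A bright white alert (luminance 1.0) with four candidate regions:
# bright sky, bright sky, a table, and a dark shadow.
backgrounds = [0.95, 0.90, 0.40, 0.10]
print(pick_region(1.0, backgrounds))  # -> 3 (the darkest region)
```

A real system would sample luminance from the camera feed per frame and likely weight the choice by distance from the associated real-world object, but the selection step reduces to this kind of contrast comparison.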

Question: Isn't it annoying for users that the images on screen constantly change position…?

Kohei responds that it requires further research…

~~~

Last in this session is Stephen Peterson from Linköping University, with a talk about Label Segregation by Remapping Stereoscopic Depth in Far-Field Augmented Reality.

The domain: air traffic control, a profession that requires maintaining multiple sources of information and cognitively combining them into a single context.

Can Augmented Reality help?

The main challenge is labeling: how do you avoid a clutter of labels that could quickly confuse the air traffic controller?

The conclusion: remapping the stereoscopic depth of overlapping labels in far-field AR improves performance. In other words, when you need to display numerous labels that might overlap on screen, use the depth of the view and display the labels in different 3D layers.
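The layering idea can be sketched as a simple greedy pass (a hypothetical illustration, not Peterson's actual algorithm; each label is represented as an (x, y, w, h) screen rectangle): walk the labels in order and push any label that overlaps an already-placed one onto the next-deeper stereo layer.

```python
def overlaps(a, b):
    """Axis-aligned overlap test for (x, y, w, h) screen rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def assign_depth_layers(rects):
    """Greedily assign each label the nearest stereo layer on which it
    overlaps no previously placed label, so overlapping labels separate
    in depth rather than being moved around on screen."""
    layers = []        # layers[i] = rectangles already placed on layer i
    assignment = []
    for r in rects:
        layer = 0
        while layer < len(layers) and any(overlaps(r, o) for o in layers[layer]):
            layer += 1
        if layer == len(layers):
            layers.append([])
        layers[layer].append(r)
        assignment.append(layer)
    return assignment

# Three aircraft labels; the first two overlap on screen.
labels = [(10, 10, 50, 20), (30, 15, 50, 20), (200, 10, 50, 20)]
print(assign_depth_layers(labels))  # -> [0, 1, 0]
```

The key design point matches the talk's conclusion: the labels keep their screen positions (so they stay attached to their aircraft) and only their stereoscopic depth changes, which is what lets the controller's eyes separate them.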