The second reason for video game immersion is our mind and its incredible capacity for abstraction. This is the point games have in common with other art forms. Movies, music, literature, poetry and painting all rely on the human mind to achieve immersion. But because we human beings have such a huge capacity for abstraction and imagination, most of these media still achieve immersion while offering vastly incomplete experiences. And that’s not a bad thing. A movie is very close to a full experience: it offers detailed visual and auditory stimuli. On the other hand, listening to a concert on an iPod offers nothing but audio and is still capable of moving us deeply. More impressive yet is abstract painting. Even with only a loose correlation to reality, it still makes us think and react. But perhaps the most impressive example is literature. A novel offers nothing other than letters. Although it is not as interpretation-heavy as an abstract painting, the feelings a book evokes are created entirely by the brain. The flat pages have no pictures, audio, smell or taste. But we are still capable of imagining all that from the otherwise meaningless letters and of feeling as if we were there, in a battle for Britain or in the middle of a rather funny version of the days preceding Armageddon (two of my favorite books, by the way).

This is the immersion aspect that moves game technology the farthest. Graphics evolve with every new console or video card iteration in the search for immersion. But if you stop to think about it, isn’t this inconsistent with what I just said? If we can get immersion even from black letters on white paper, why struggle so much with graphics? Well, there’s a whole visual experience related to video games. It would be like asking Salvador Dalí why he added so many details to his paintings. It is not just about activity immersion. Everything with a visual component can aim for a visual experience and for beauty. The same goes for all the other senses. There are different types of immersion: you might love Assassin’s Creed II but still look at an individual screenshot and admire the beauty and level of detail in the 3D models and textures.

But that did not satisfy me either, and I got the best answer from Richie Nieto, who helped me with my questions in the IGDA forums. As he said, immersion depends on the suspension of disbelief, which means our brain must fool us into believing the alternate reality the game offers (isn’t that the same thing we do with reality itself? Subject for another topic). The catch is that this depends on our experiences and our expectations. When a gamer plays a very abstract game, say Lumines, he’s taken to a weird world of falling blocks and intriguing sounds. We can, of course, admire the beauty of the graphics in combination with the sound effects triggered by gameplay. But in order for Lumines to be immersive, all it has to do is be consistent with itself. There’s no other world like it, so our experiences and expectations are based on the game itself. Okami, on the other hand, has a unique art style too, obviously non-realistic. And while the lack of graphic realism does not stop us from getting involved, we have other aspects to consider. If gravity does not behave as expected, the player will notice. If the painted wolf’s head disappears behind a mountain due to a collision detection problem, it will bother us a bit. The experience is not ruined, but these details break the suspension of disbelief for a brief moment. It is even worse for a title like Metal Gear Solid 4. In this case, reality itself (well, at least as perceived by our senses) is the standard. That makes it so much harder to achieve immersion. Graphics matter a lot, as do sound, physics, movement and interactions.

So in short, the second reason for immersion is this: the game must live up to the expectations it sets and provide a consistent alternate reality. This alternate reality must not clash with itself or with the reality the player created in his mind, based on the game and on previous experiences.

Immersion is one of the greatest goals every game tries to achieve. It is more obvious in what I call character-based games. When you play Gears of War, you feel you are in Marcus Fenix’s shoes. When you play MGS4, you feel the drama of the dying hero as if it were your own. In Modern Warfare 2, every drop of blood on the screen makes you worry and take cover.

But it is also true for impersonal games; that is, the ones whose focus is not on characters. It is easier to notice when I replace the word immersion with involvement. When playing Lumines or Tower Bloxx, for example, you feel involved in the game: frustrated after a mistake, excited when a new level is reached, defiant when the score of a friend is beaten. Note that impersonal and casual games are not the same thing at all. Chess is impersonal, as are, heroes aside, most RTSs: replace one Zealot with another and there’s no difference at all. All sports games are like that too, with the exception of modes like EA’s Be A Pro.

Now, there’s a multitude of factors that contribute to how immersive a game is. I recently mentioned one when talking about Demon’s Souls: challenge. A challenged player is an involved player. There’s also the connection between player and game at the fundamental design level: some people like puzzles, some like shooters. I will not get into the merits of each one, but if you like a type of game, you will be more into it. Style also plays a big role: when graphics and sound suit the game and your mood, they also improve immersion. In fact, I would say immersion is the reason behind most graphical updates in the games industry. From polygon count and texture size to shaders, graphics chips are always evolving in the search for immersion, to provide better and more artistic or realistic visuals.

From here we get to the other facet of immersion I want to talk about: controls. No matter what game you are playing, connecting to it requires controlling it without trouble. You only get into Tetris when you learn how to move and rotate the pieces the way you want; you only appreciate Geometry Wars after getting used to moving the ship with the analog stick; you only feel like Sam Fisher when pulling off all his moves gets easy; you only enjoy FIFA 10 when passing, dribbling and shooting become second nature.

Over the years, games have become more and more complex. And with game complexity came complex controls. The Atari had 1 button. The NES had 2. The Genesis had 3. In the current generation, both PS3 and 360 have 4 face buttons, 2 shoulder buttons, 2 shoulder triggers and 2 clickable analog sticks. And I am not counting d-pads or the select and start buttons.

Most gamers are used to it. Hell, controllers could have more buttons and that wouldn’t be a problem, not to me. But with the latest generation of consoles, we saw a big move in the opposite direction coming from Nintendo. The Wii has fewer buttons and makes up for it by trying to detect something everybody knows how to do: move and point. When I first heard about it, I was very interested, both as a gamer and as a robotic perception researcher. To me, that meant games would become even more immersive and shooters would feel even more realistic.

It is curious that, while the Wii controllers did increase immersion, that change did not affect most hardcore games. Nintendo correctly (from a business perspective) focused on using the more approachable controls to bring a new crowd to video games. And it worked very well for them.

Nintendo’s approach was so right that others have been following it ever since. After the success of the Wii, many game platforms started exploring new and more natural input methods. Touch screens and accelerometers became very popular.

But we will soon reach a new apex. Something I personally have been waiting for since I started studying computer vision. And Microsoft is the one about to pull it off: no controls. No buttons at all. If you haven’t heard of Project Natal before, go check it out. It is awesome.

The idea of a vision system in games is not new: the PS2 had the EyeToy. But there were many technical limitations, from sensor capability (a single fixed camera won’t give proper depth perception, for example) to processing power, as robust computer vision algorithms require a whole lot of processing. Project Natal solves these problems in a very interesting way: a single camera is used for “texture” detection, and instead of stereo vision, 3D perception is achieved with a depth sensor. As for the processing power, Project Natal’s device features a custom processor, which is certainly there to reduce the load on the 360 hardware.

As Nintendo’s did, Project Natal’s first efforts will probably aim at the casual market and bring more gamers to the table. But that does not change the fact that immersion in video games will take a big leap. Imagine playing Lumines by grabbing the blocks and rotating them with your hands. Or simply using your empty hands to select your playlist. Wouldn’t it be cool? Heck, in Minority Report Tom Cruise needed cool glowing gloves to do what we are about to get with our bare hands. The future is here, my friends.

Anyway, as the number of buttons gets close to the limit (after all, we only have 10 fingers), new input methods are here to stay. I don’t have a Wii. But I will need a bigger living room when Project Natal becomes available.