Animation in the CAVE

Josephine Anstey and Dave Pape describe the CAVE Automatic Virtual Environment, a virtual reality display device which uses 3-D animation. In other words, it is an entertainment prototype that can best be described as a Star Trek HoloDeck precursor.

"You walk into a ten by ten foot room, put on a special pair of glasses, and take hold of a 3-D mouse. As the program starts, the walls disappear. Now you are surrounded by the cobbles, stonework and red tiles of an Italian Renaissance city. Down the street you can see a cathedral. Pushing a button on the mouse moves you forward. Then you see an anomaly - a sleek, electric blue column. As you get closer, the column whirs and splits - a disjointed little figure unfolds. The figure gestures to you, then takes off. It looks back and beckons, `Come on!'

"You follow this strange guide to a dark stairway, winding up inside the cathedral. At the top, a low doorway takes you outside. The city stretches beneath you. Transparent pathways curl around the dome and lead back to the ground. You see the little guide far below. He is dancing wildly as a stream of flying letters and books sail past him and whip inside a building. What is in there .....?"

VR Without the Head Gear

This is a description of an experience in the CAVE virtual reality theatre. Virtual Reality is the art and science of using computers to create three dimensional worlds that users can be immersed in, explore as they please, and interact with in real time.

The CAVE, a recursive acronym for CAVE Automatic Virtual Environment, is a virtual reality display device, but not the kind of head-mounted display normally associated with VR. Instead, it's more like a prototype for Star Trek's HoloDeck: a room that people can enter, with stereoscopic computer images projected on the walls and floor. The computer continually updates and redraws the display as users move through the environment. One of the CAVE's most promising uses is the creation of animated 3-D worlds and characters that a user can interact with, in effect making the user part of a story.

The CAVE's History

The CAVE was created at the University of Illinois at Chicago's Electronic Visualization Laboratory (UIC's EVL). EVL is a state-of-the-art research lab for interactive computer graphics and brings together students from UIC's Schools of Engineering and Art and Design. It was founded in the early 1970s by Dan Sandin, art professor and creator of the Sandin Image Processor, a device well known in the video art community, and Tom DeFanti, engineering professor and author of GRASS, an early computer animation system. The Lab's work has always been a mixture of art, entertainment, engineering, and science. In the '70s, EVL staged public performances of interactive electronic art and provided the computer hardware and software used to create the original computer graphics in Star Wars. Later work included developing graphics hardware that formed one of the first home computer systems, investigating 3-D fractal imagery, and using visualization for scientific research with the National Center for Supercomputing Applications.

In 1991, DeFanti and Sandin decided to use their experience with video and interactive computer graphics to create a new approach to the growing field of virtual reality. Traditional VR systems were head-mounted, usually consisting of a pair of small video displays attached to a helmet or mechanical boom. Most such displays were low-resolution and encumbering, and they isolated the user. EVL's new device, the CAVE, used video projection screens to create a VR display that users entered, rather than wore. The CAVE display was high-resolution, required users to wear only lightweight shutter glasses, and could be shared by whole groups of people at once. It was implemented by EVL students Carolina Cruz-Neira, Greg Dawe, Sumit Das, and others, and was first shown at the 1992 SIGGRAPH Conference in Chicago. The full system was completed barely in time for the conference, and many of the demonstrated applications hadn't been seen in the CAVE itself before show time.

The CAVE is a 10 foot by 10 foot cube; three walls are rear projection screens, and the floor is projected onto from above. High-end Silicon Graphics computers, such as an Onyx2 Infinite Reality, generate the 3-D images and simulate the dynamics of the virtual world. Another SGI machine, connected to loudspeakers in the four corners of the CAVE, creates the sounds of the environment. The ImmersaDesk is a newer, smaller-scale version of the CAVE and resembles a drafting table-style display, rather than an entire room. Since 1992, over 50 CAVEs, ImmersaDesks, and similar devices have been installed in universities, corporate labs, and a few museums.

A Myriad of Applications

The difference between virtual reality and normal computer graphics is that with virtual reality, the user is immersed in the computer-generated environment. The user is surrounded by images and sound. The images are in stereoscopic 3-D, rather than flat on the screen, and the world is displayed in a first-person perspective, from the user's own viewpoint, rather than the third-person viewpoint common to most other forms of image creation. To complete the immersion in the virtual world, VR is interactive, meaning the user can have (some) control over what happens.

Whether it's molecular biology, cosmology, architecture and design, education, entertainment or the arts, VR can be applied to any problem that can benefit from an immersive, three-dimensional, interactive solution. General Motors has started using CAVEs to evaluate the design of new car interiors before having to build physical prototypes. Old Dominion University is using an ImmersaDesk to view computer simulations of the Chesapeake Bay ecosystem. At the National Center for Supercomputing Applications (NCSA), Donna Cox used the CAVE program Virtual Director to create animation for the IMAX film Cosmic Voyage. A group at EVL has built a virtual island where children can tend a virtual garden and learn about environmental concepts. EVL also participates fully in the world of electronic art. Dan Sandin organized the opening show for the first CAVE installed in a museum of Art and Technology, the Ars Electronica Center in Linz, Austria, which featured projects by EVL faculty and students.

The Thing

We are currently working on a project, The Thing Growing, whose focus is the construction of the "Thing," a virtual, interactive, animated character in the CAVE. The goal of the project is to create a story in which the user takes a leading role and is engaged at an emotional level with the Thing.

The Thing looks translucent. The triangular shapes forming its head, appendages and body do not seem to join up. It changes colors as it speaks and according to its moods. It is alternately bullying and loving. It has no specific gender, and its goal is to make the user dance with it, which it takes as a sign of love and obedience.

To animate in the CAVE, we use tools familiar to any computer animator. For example, the models for The Thing Growing are being made in Softimage and the textures are being made in Photoshop. These models are then imported into the CAVE. In VR, the computer has to redraw the scene in about one sixtieth of a second to keep the frame rate at 30 frames per second, because it has to draw a different view for each eye. Therefore, even with an Onyx2, these models must be far simpler than those of computer animation for film and video, where you can spend minutes or hours rendering a single frame. The gain, and we think it's an exciting one, is being able to interact, in real time, with a virtual character and world.
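
The arithmetic behind that drawing budget can be written out in a few lines. This is only an illustration: the function name is hypothetical, and it assumes the two eye views are drawn one after the other and share each frame's time equally.

```python
def per_view_budget_seconds(frames_per_second: float, views_per_frame: int = 2) -> float:
    """Time available to draw one eye's view, assuming the views split a frame equally."""
    return 1.0 / (frames_per_second * views_per_frame)


# 30 stereo frames per second, with a left-eye and a right-eye view in each frame.
budget = per_view_budget_seconds(30)
print(f"{budget * 1000:.1f} ms per view")
```

At 30 frames per second and two views per frame, the budget works out to one sixtieth of a second per view, which is the figure quoted above.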

Animating Virtual Reality

Virtual reality applications can use a wide variety of methods for animating. Flipbooks, keyframing, motion capture, and procedural (computer programmed) animation are all used. "The Multi Mega Book in the CAVE," which was described at the beginning of this article, uses a flipbook of 3-D models to walk a wire-framed Judas out of da Vinci's painting of the Last Supper. In The Thing Growing, rocks come alive and chase the user. When a rock gets close enough, it rears up and swallows the user. In this case there are only four simple models and the CAVE morphs between them to produce the rock's growing and grabbing action.
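
Morphing between a handful of models, as with the rock, can be sketched roughly as follows. This is a minimal illustration and not the CAVE's actual code: it assumes every model has the same number of vertices in the same order and blends them linearly, and the names morph and rock_shapes are hypothetical.

```python
from typing import List, Tuple

Vertex = Tuple[float, ...]
Model = List[Vertex]


def morph(models: List[Model], t: float) -> Model:
    """Blend across a sequence of key models; t runs from 0.0 (first) to 1.0 (last)."""
    t = min(max(t, 0.0), 1.0)
    scaled = t * (len(models) - 1)
    i = min(int(scaled), len(models) - 2)   # index of the model being left behind
    f = scaled - i                          # fraction of the way to model i + 1
    a, b = models[i], models[i + 1]
    return [tuple(p + (q - p) * f for p, q in zip(va, vb)) for va, vb in zip(a, b)]


# Four one-vertex "models" standing in for the rock's four shapes (made-up data).
rock_shapes = [[(0.0, 0.0, 0.0)], [(0.0, 1.0, 0.0)], [(0.0, 3.0, 0.0)], [(1.0, 3.0, 0.0)]]
halfway = morph(rock_shapes, 0.5)
```

Running t from 0 to 1 over the length of the action steps the shape through all four models in turn.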

We commonly use keyframe animation to move objects in a CAVE application. For example, to animate the flying letters referred to in the description of the "Multi Mega Book," keyframes were set to determine the path of the letters through the city. As the letters move along this path, a simple behavior routine makes them also spin and orbit each other.
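
A keyframed path combined with a simple behavior routine might look something like the sketch below. The keyframe values, function names, and the orbit and spin rates are all invented for illustration, and the path is interpolated linearly, which may not match what the application actually does.

```python
import math
from bisect import bisect_right

# Hypothetical keyframes: (time in seconds, (x, y, z) position along the path).
KEYFRAMES = [(0.0, (0.0, 2.0, 0.0)), (4.0, (10.0, 2.0, 0.0)), (6.0, (10.0, 5.0, -4.0))]


def path_position(t):
    """Linearly interpolate the path position at time t, clamped to the keyed range."""
    times = [k[0] for k in KEYFRAMES]
    t = min(max(t, times[0]), times[-1])
    i = min(bisect_right(times, t) - 1, len(KEYFRAMES) - 2)
    (t0, p0), (t1, p1) = KEYFRAMES[i], KEYFRAMES[i + 1]
    f = (t - t0) / (t1 - t0)
    return tuple(a + (b - a) * f for a, b in zip(p0, p1))


def letter_pose(t, index, count, radius=0.5, orbit_hz=0.25, spin_hz=1.0):
    """Position and spin angle of one letter orbiting the moving path point."""
    cx, cy, cz = path_position(t)
    phase = 2 * math.pi * (orbit_hz * t + index / count)   # letters spaced around a circle
    position = (cx + radius * math.cos(phase), cy + radius * math.sin(phase), cz)
    spin_degrees = (360.0 * spin_hz * t) % 360.0
    return position, spin_degrees
```

The keyframes decide where the group of letters is; the behavior routine only adds local motion around that point, so the two can be adjusted independently.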

The CAVE uses a tracking system to get information on the user's head and hand position in order for the computer to compute the perspective from a user-centered point of view. Tracking is done with electromagnetic systems such as Ascension's Flock of Birds; sensors are attached to the stereo glasses and to the 3-D mouse.

We use this same system to record motion-tracked animation. Our first experiments with this were for the "Multi Mega Book." Our collaborator, Franz Fischnaller, had already determined that the shape for the guide character would be a simple collection of geometric shapes. We hooked Franz up to four tracking sensors: one for the head, one for each arm, and one for the body. At the same time we ran a CAVE program that took the position and orientation information from the tracker and fed it to the parts of the character's body which were then displayed. So, as he moved, Franz could immediately see how the character would move. Its head moved as he moved his head. Its arm waved as he moved his arm. In this way, we could build up animation for the individual body parts. Later, we could define keyframes to move the body as a whole along a path through the virtual world.
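
Recording tracker data per body part could be organized along these lines. This is a sketch under stated assumptions: the sensor numbering, the MotionRecorder class, and the sample values are hypothetical, and no real tracker API is used here.

```python
from collections import defaultdict

# Hypothetical sensor-to-part mapping. The text names four sensors (head, each arm,
# body) but does not say how they were identified in software.
SENSOR_TO_PART = {0: "head", 1: "left_arm", 2: "right_arm", 3: "body"}


class MotionRecorder:
    """Store timestamped poses per body part so they can be replayed later."""

    def __init__(self):
        self.tracks = defaultdict(list)   # part name -> [(time, position, orientation)]

    def record(self, time, samples):
        """samples maps sensor id -> (position, orientation) for one tracker reading."""
        for sensor_id, (position, orientation) in samples.items():
            part = SENSOR_TO_PART.get(sensor_id)
            if part is not None:          # ignore sensors that drive no body part
                self.tracks[part].append((time, position, orientation))


recorder = MotionRecorder()
recorder.record(0.0, {0: ((0.0, 1.7, 0.0), (0.0, 0.0, 0.0)),
                      1: ((-0.4, 1.2, 0.1), (0.0, 0.0, 45.0))})
recorder.record(0.1, {0: ((0.0, 1.7, 0.1), (5.0, 0.0, 0.0))})
```

The same samples that are stored could also be applied to the displayed character straight away, which is what lets the performer see the motion while it is being captured.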

Building Dynamic Responses

Although motion-tracking and keyframing are useful in creating movements of characters and objects, they can't be used alone in VR. Because the virtual world is interactive, the user is an integral part of it, and the full progression of the story line isn't known in advance. We can pre-animate elements of the action such as walk cycles, dances and gestures, but these elements have to be combined dynamically in response to the user. Creating this level of interaction is easily the most challenging aspect of CAVE animation.

Building interaction for the CAVE means making objects intelligent and able to react to people. A simple example comes from The Thing Growing. At one point in the narrative the Thing becomes so angry with the user that it hides under one of the rocks in the vast plain that the action takes place on. This is the point when the other rocks come alive and start to stalk and herd the user. Intelligence has to be programmed into the rocks. They have to know where the user is. They have to avoid each other, and they have to sneak up on the user and try to trap him. Instead of movement based on keyframes, the rocks are given a set of rules on how to move until one grabs the user. When that happens, all the other rocks scatter and the user's ability to navigate is taken away. He is trapped with a rock slobbering on him.
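
Rule-based movement of this kind can be sketched as a small steering function. The version below covers only part of the behavior described above (approach the user, keep away from other rocks, detect a grab). The function name, distances, and speeds are assumptions made for illustration.

```python
import math


def step_rock(rock, others, user, speed=0.5, personal_space=2.0, grab_range=1.0, dt=1.0):
    """Advance one rock on the ground plane; return (new_position, grabbed_user)."""
    # Rule 1: head toward the user, and grab once close enough.
    dx, dz = user[0] - rock[0], user[1] - rock[1]
    dist_to_user = math.hypot(dx, dz)
    if dist_to_user <= grab_range:
        return rock, True
    steer_x, steer_z = dx / dist_to_user, dz / dist_to_user
    # Rule 2: push away from any other rock that is too close.
    for other in others:
        ox, oz = rock[0] - other[0], rock[1] - other[1]
        gap = math.hypot(ox, oz)
        if 0.0 < gap < personal_space:
            steer_x += ox / gap
            steer_z += oz / gap
    length = math.hypot(steer_x, steer_z)
    if length == 0.0:                     # the rules cancel out: stay put this step
        return rock, False
    step = speed * dt
    return (rock[0] + steer_x / length * step, rock[1] + steer_z / length * step), False
```

Calling this once per frame for every rock, with the user's tracked position, gives movement that is never keyframed yet still looks purposeful. When any call reports a grab, the program can make the remaining rocks scatter and disable navigation.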

For the Thing itself we are using motion tracking to build up a library of actions. In this case there are eight body parts: a head, two arms, the body and four tail sections. Each action lasts a few seconds and has a corresponding sound bite. As the program runs, the Thing's intelligence unit selects an appropriate action and sound according to the point in the narrative, the user's actions, and the Thing's own emotional state. The computer will interpolate between the end of one action and the beginning of the next, so that the movement is smooth.
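
The interpolation between actions can be illustrated with positions alone. This is a simplified sketch: the pose layout and function names are hypothetical, the blend is linear, and orientations (which a real system would also need to blend) are left out.

```python
def blend_poses(end_pose, start_pose, f):
    """Linearly blend two poses (part name -> (x, y, z)); f=0 is end_pose, f=1 is start_pose."""
    return {
        part: tuple(a + (b - a) * f for a, b in zip(end_pose[part], start_pose[part]))
        for part in end_pose
    }


def transition(end_pose, start_pose, frames):
    """In-between poses bridging the last frame of one action and the first frame of the next."""
    return [blend_poses(end_pose, start_pose, i / (frames + 1)) for i in range(1, frames + 1)]


# Made-up poses for two of the eight body parts.
wave_end = {"head": (0.0, 1.0, 0.0), "left_arm": (-1.0, 2.0, 0.0)}
bow_start = {"head": (0.0, 0.0, 1.0), "left_arm": (-1.0, 0.0, 0.0)}
bridge = transition(wave_end, bow_start, 3)
```

Because any action can follow any other, the in-between frames are generated at run time rather than stored with the motion-tracked library.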

The Thing's intelligence also decides how the body as a whole moves. This movement may be relative to the user as the Thing stays close or swoops in on the user or as it decides to move to a particular spot in the environment. All the while it has to avoid other objects. The computations for these movements are done on the fly, using a set of rules and information about the position of the user, the Thing, and the other objects.

The Thing has four basic moods: happy, depressed, manic and angry. Its emotional state is established in part from information about the user. Essentially all that the computer, and therefore the Thing, can know about a user is the tracked position and orientation of the user's head and one, or both, hands. So, for example, we keep track of the user's head position relative to the Thing. If the user looks at the Thing most of the time, it interprets that as attentiveness and that makes it happy. We also monitor the general activity of the user. A user that moves around a lot is fast, and this will tend to make a happy Thing manic. If the Thing is not so happy, fast user movement will make it angry. Slow user movement will make it depressed. The Thing's emotional state will also fluctuate according to an internal set of rules, so that the emotions are not simply a reflection of the user. In fact, too much of any one emotion will flip it over into a different one.
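
These mood rules can be expressed as a small function. The sketch below is only one possible reading: the thresholds, the order in which the rules are checked, and the choice of which mood an overlong emotion flips into are all assumptions, and next_mood is a hypothetical name.

```python
# Assumed flip targets: the text says only that an overlong emotion flips to a different one.
FLIP = {"happy": "depressed", "depressed": "angry", "angry": "manic", "manic": "happy"}


def next_mood(mood, attention, activity, time_in_mood,
              attentive_above=0.5, fast_above=0.7, slow_below=0.2, max_time=60.0):
    """Apply the mood rules once.

    attention: fraction of recent time the user's head faced the Thing (0..1)
    activity:  normalized measure of how much the user is moving (0..1)
    """
    if time_in_mood > max_time:           # too much of one emotion flips it over
        return FLIP[mood]
    if activity > fast_above:             # fast user: a happy Thing turns manic, others angry
        return "manic" if mood in ("happy", "manic") else "angry"
    if activity < slow_below:             # slow user: depressed
        return "depressed"
    if attention > attentive_above:       # attentive user: happy
        return "happy"
    return mood
```

For example, next_mood("happy", 0.9, 0.9, 5.0) gives "manic", while the same fast movement applied to a depressed Thing gives "angry".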

In addition to monitoring the user in general, we check specific user movements. The Thing is attempting to teach the user a dance. It will demonstrate each part of the dance, then observe or join the user as he copies the movement. Each of the parts of the dance will be one of the Thing's actions. Also each of these particular actions will have a test associated with it to verify whether the user is dancing correctly. Knowing whether the user is dancing correctly will feed back both to the Thing's emotional component and to its decision-making process. It may decide to repeat a part of the dance that the user is doing incorrectly. It will admonish, encourage or praise the user according to the user's behavior and its own mood.
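
A test attached to one dance action might compare the user's tracked hand against the demonstrated movement. The sketch below is purely illustrative: it assumes both motions are sampled at matching times as hand offsets relative to the head, and the function name, tolerance, and pass fraction are invented.

```python
import math


def dance_step_matches(user_offsets, target_offsets, tolerance=0.3, required_fraction=0.7):
    """Compare hand-relative-to-head offsets, sample by sample, against the demonstrated step."""
    pairs = list(zip(user_offsets, target_offsets))
    if not pairs:
        return False
    close = sum(1 for u, t in pairs if math.dist(u, t) <= tolerance)
    return close / len(pairs) >= required_fraction


# Made-up data: the demonstrated step raises the hand; one attempt follows it, one does not.
target = [(0.3, 0.2, 0.0), (0.3, 0.5, 0.0), (0.3, 0.8, 0.0)]
good_try = [(0.3, 0.25, 0.0), (0.35, 0.5, 0.0), (0.3, 0.7, 0.1)]
bad_try = [(0.3, -0.5, 0.0), (0.3, -0.5, 0.0), (0.3, -0.5, 0.0)]
```

The result of such a test is a single yes or no, which is all the emotional component and the decision-making process need in order to react.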

This process of developing an animated character focuses on maximizing the advantages of CAVE VR, the immersion into an interactive 3-D world, while minimizing the disadvantages, like the relative simplicity of the images. It was immediately obvious that any degree of photo-realism, and even the visual complexity of most computer animation, was impossible. This resulted in a decision to make a very clean, uncomplicated environment, and to create a virtual character who is simply a collection of pyramids. Capturing the motion for the Thing's head and limbs is the most efficient and effective way of giving this collection a life.

Thereafter, the most daunting task is to give that life intelligence. The starting point is assessing its sensory inputs, the tracking information. In building the intelligence component, we interpret the data the Thing receives, but we also try to construct its character so that the paucity of the information it is working with is not apparent. The Thing is high-handed and willful, in part because of the exigencies of the story line and in part to hide its stupidity. It is also inconsistent, arbitrarily praising or abusing the user for the same behavior; again, in part to mimic the inconsistency of many people and in part to hide its ignorance.

The Thing Growing is still under development. At this stage we can't judge how effective or real an experience it will give a user. We hope that one day, after we have tested and refined the program, it will be possible for the Thing to build a relationship with a user, however virtual that relationship may be.

Josephine Anstey is a Master of Fine Arts student and Dave Pape is a computer science PhD candidate at the Electronic Visualization Laboratory, University of Illinois at Chicago. Anstey's work focuses on the creation of virtual characters and interactive narrative, whereas Pape's research involves the underlying software for developing virtual reality applications.