Building an AI Sensory System: Examining The Design of Thief: The Dark Project

The term "senses" in game development is a useful metaphor for understanding, designing, and discussing that part of the AI that gathers information about items of interest in the simulated environment of the game. Non-player characters visually presented as humans, animals, or creatures with eyes and ears in a realistic three-dimensional space lend themselves well to the metaphor.

This engineering metaphor is best not applied too literally. In spite of the seemingly physical nature of the AIs in the game world, the analogy of game AI senses is not a physiological or neurological one. The line between "sense" and "knowledge" in a game is a blurry one. Sense incorporates the idea of awareness of another entity in the game, includes elements of value and knowledge, and can have game-relevant logic wired directly in.

A game sensory system must be designed in a way that is subservient to the game design and efficient in implementation. The senses need only be as sophisticated as is needed to be entertaining and robust. The result of their work must be perceivable and understandable by the player. Few game designs require AIs with a sense of taste, touch, or smell; thus senses primarily are concerned with vision or hearing. Used wisely, senses can be an invaluable tool to make simple state machines more interesting by providing them with a broad range of environmental input.

This paper describes an approach to designing and implementing a high-fidelity sensory system for a stealth-oriented first-person AI system. The techniques described are derived from experience constructing the AI for Thief: The Dark Project, as well as familiarity with the code of Half-Life. Initially, the basic concepts of AI senses are laid out using Half-Life as a motivating example. The paper then examines the more stringent sensory requirements of a stealth game design. Finally, the sensory system built for Thief is described.

An Introductory Example: Half-Life

Half-Life is not a game that centers on stealth and senses. With a strong tactical combat element, however, it does require a reasonable sensory system. This makes it a perfect case to explore the basics of AI sensory systems. AIs in Half-Life have sight and hearing, a system for managing information about sensed entities, and present interesting examples of leveraging basic senses into appealing behaviors.

In a simple sensory system, AIs periodically "look" at and "listen" to the world. Unlike real vision and hearing where stimuli arrive at the senses whether desired or not, these are active events. The AI examines the world based on its interest, and decides according to a set of rules that it sees or hears another element in the game. These probes are designed to emulate real senses while limiting the amount of work done. A greater amount of resources is dedicated to the things that are important for the game mechanics.
For example, in Half-Life the core sensory logic that is run periodically is:

If I am close to player then...

Begin look
--Gather a list of entities within a specified distance
--For each entity found...
----If I want to look for them and
----If they are in my viewcone and
----If I can raycast from my eyes to their eyes then...
------If they are the player and
------If I have been told to not see the player until they see me and
------If they do not see me
--------End look
------Else
--------Set various signals depending on my relationship with the seen entity
End look

Begin listen
--For each sound being played...
----If the sound is carrying to my ears...
------Add the sound to a list of heard sounds
------If the sound is a real sound...
--------Set a signal indicating heard something
------If the sound is a "smell" pseudo-sound
--------Set a signal indicating smelled something
End listen

The first concept illustrated by this pseudo-code is that the senses are closely tied to the properties of the AI, its relationship with the subject, and the relevance of the AI to the player's experience. This is motivated in part by optimization concerns, but it is made possible by the game mechanics. In the Half-Life game design, an AI that is not near the player is not relevant and need not sense the world. Even when near the player, the AI needs to look only at things that are known to produce reactions of fear or hatred later.

The logic also demonstrates the basic construction of vision as a view distance, a view cone, line-of-sight, and eye position (Figure 1). Each AI has a length-limited two-dimensional field of view within which it will cast rays to interesting objects. Unblocked ray casts indicate visibility.
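The vision test described above can be sketched in a few lines. This is a minimal illustration and not Half-Life's actual code: the Viewer fields, the can_see function, and the caller-supplied line_of_sight_clear callback (standing in for the engine's ray cast) are all hypothetical names. The checks run cheapest first, so the ray cast only happens for targets that are already inside the view distance and the view cone.

```python
import math
from dataclasses import dataclass

@dataclass
class Viewer:
    # All field names are illustrative assumptions, not engine identifiers.
    x: float
    y: float
    facing_deg: float      # direction the AI is facing, in degrees
    view_distance: float   # maximum sight range
    half_angle_deg: float  # half the width of the two-dimensional view cone

def can_see(viewer, tx, ty, line_of_sight_clear):
    """Return True if the point (tx, ty) is visible to the viewer."""
    dx, dy = tx - viewer.x, ty - viewer.y
    # 1. Distance check: compare squared lengths to avoid a square root.
    if dx * dx + dy * dy > viewer.view_distance ** 2:
        return False
    # 2. View cone check: signed angle between facing and target direction,
    #    wrapped into the range [-180, 180).
    angle_to_target = math.degrees(math.atan2(dy, dx))
    delta = (angle_to_target - viewer.facing_deg + 180.0) % 360.0 - 180.0
    if abs(delta) > viewer.half_angle_deg:
        return False
    # 3. Most expensive test last: the ray cast, supplied by the caller.
    return bool(line_of_sight_clear(viewer.x, viewer.y, tx, ty))
```

A caller would pass whatever ray cast routine the engine provides as line_of_sight_clear; an unblocked ray means the target is visible.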

Figure 1

There are two important things to note. First, the operations of sensing are ordered from least expensive to most expensive. Second, for player satisfaction, vision is a game of peek-a-boo. In a first-person game, the player's sense of body is weak, and a player who is seen by an opponent they cannot see often feels cheated.

Most interesting is the snippet that restrains the AI's ability to see the player until seen by the player, which is purely for coordinating the player's entertainment. This is an example of how higher-level game goals can be simply and elegantly achieved by simple techniques in lower level systems.

The logic for hearing is much simpler than vision. The basic element of a hearing component is the definition and tuning of what it means for a sound to carry to the AI audibly. In the case of Half-Life, hearing is a straightforward heuristic of the volume of the sound multiplied by a "hearing sensitivity" yielding a distance within which the AI hears the sound. More interesting is the demonstration of the utility of hearing as a catchall for general world information gathering. In this example, the AI "hears" pseudo-sounds, fictional smells emanating from nearby corpses.
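The hearing heuristic can be sketched as follows. This is an illustrative reading of the description above, not code from Half-Life: the Sound class, the listen function, and the signal names are hypothetical. A sound carries to the AI when the listener lies within a radius equal to the sound's volume multiplied by the AI's hearing sensitivity, and "smell" pseudo-sounds travel through the same channel but raise a different signal.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Sound:
    # Hypothetical record for one sound currently being played.
    position: Tuple[float, float, float]
    volume: float
    is_smell: bool = False  # True for a "smell" pseudo-sound

def listen(listener_pos, hearing_sensitivity, sounds):
    """Return (heard_sounds, signals) for one listening pass."""
    heard = []
    signals = set()
    for sound in sounds:
        # Volume times sensitivity yields the distance the sound carries.
        radius = sound.volume * hearing_sensitivity
        if math.dist(listener_pos, sound.position) <= radius:
            heard.append(sound)
            signals.add("smelled_something" if sound.is_smell
                        else "heard_something")
    return heard, signals
```

Because a pseudo-sound is just another entry in the list, new kinds of ambient information can be added without touching the listening loop itself.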

Senses as Gameplay Focus: Thief

Thief: The Dark Project and its successors present a lightly scripted game world where the central game mechanic, stealth, challenges the traditional form of the first-person 3D game. The Thief player moves slowly, avoids conflict, is penalized for killing people, and is entirely mortal. The gameplay centers on the ebb and flow of AI sensory knowledge of the player as they move through the game space. The player is expected to move through areas populated with stationary, pacing, and patrolling AIs without being detected, creeping among shadows and taking care not to make alerting sounds. Though the game AI's senses are built on the same core concepts as those of Half-Life, the mechanics of skulking, evading, and surprising require a more sophisticated sensory system.

The primary requirement was creating a highly tunable sensory system that operated within a wide spectrum of states. On the surface, stealth gameplay is about fictional themes of hiding, evasion, surprise, quiet, light and dark. One of the things that makes that kind of experience fun is broadening out the gray zone between safety and danger that in most first-person games is razor thin. It's about getting the player's heart pounding by holding them on the cusp of either state, then letting loose once the zone is crossed. This demanded "broad-spectrum" senses that didn't tend to polarize rapidly to the extremes of "player sensed" and "player not sensed."

A secondary requirement was that the sense system be active much more frequently, and operate on more objects, than is typical of a first-person shooter. During the course of the game, the player can alter the state of the world in ways that the AIs are supposed to take notice of, even when the player is not around. These things, like body hiding, require reliable sensing. Together with the first requirement, these created an interesting challenge when weighed against the perennial requirement for game developers: performance.

Finally, it was necessary that both players and designers understand the inputs and outputs of the sensory system, and that the outputs match learned expectations based on the inputs. This suggested a solution with a limited number of player-perceivable inputs, and discrete valued results.

Expanding the Senses

At heart, the sensory system described here is very similar to that found in Half-Life. It is a viewcone and raycast based vision system and simple hearing system with hooks to support optimization, game mechanics, and pseudo-sensory data. Like the Half-Life example, most of the sense gathering is decoupled from the decision process that acts on that information. This system expands some of these core ideas, and introduces a few new ones.

Figure 2, Basic components and relationships

The design of the system and the flow of data through it are derived from its definition as an information gathering system that is customizable and tunable, but stable and intelligible in its output.

In this system, AI senses are framed in terms of "awareness." Awareness is expressed as a range of discrete states that represent an AI's certainty about the presence, location, and identity of an object of interest. These discrete states are the only representation of the internals of the system exposed to the designer, and are correlated by the higher-level AI to an alertness state. In Thief's AI, the range of alertness states is similar to awareness states. The alertness state of the AI is fed back into the sensory system in various ways to alter the behavior of the system.

Awareness is stored in sense links that associate a given AI with either another entity in the game or a position in space. These relations store game-relevant details of the sensing (time, location, line-of-sight, etc.), as well as cached values used to reduce calculations from think cycle to think cycle. Sense links are, in effect, the primary memory of the AI. Through verbalization and observation, sense links can be propagated among peer AIs, with controls in place to constrain knowledge cascades across a level. They may also be manipulated by game logic after base processing.
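One way to picture a sense link is as a small record per AI and target. The sketch below is an assumption-laden illustration, not Thief's data layout: the awareness state names, the SenseLink fields, the hop counter used to limit propagation, and the rule that alertness is the highest awareness across all links are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Optional, Tuple

class Awareness(IntEnum):
    # Hypothetical state names; the text says only that states are discrete.
    NONE = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3

@dataclass
class SenseLink:
    target_id: Optional[int]               # sensed entity, or None for a bare position
    position: Tuple[float, float, float]   # last sensed location
    awareness: Awareness = Awareness.NONE
    time_sensed: float = 0.0
    has_line_of_sight: bool = False
    hops: int = 0                          # times this link was passed between AIs

def propagate(link, max_hops=1):
    """Copy a link for a peer AI, or return None once the hop budget is spent.

    The hop budget is one assumed way to constrain knowledge cascades.
    """
    if link.hops >= max_hops:
        return None
    # The peer learned this second-hand, so it has no line of sight itself.
    return replace(link, has_line_of_sight=False, hops=link.hops + 1)

def alertness(links):
    """Assumed correlation: alertness is the highest awareness on any link."""
    return max((link.awareness for link in links), default=Awareness.NONE)
```

With max_hops set to 1, an AI that was told about the player by a peer cannot pass that knowledge on again, which keeps a single sighting from alerting an entire level.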

Figure 3, Sense Links

Each object of interest in the game has an intrinsic visibility value independent of any viewer. Depending on the state of the game and the nature of the object, the level of detail of this value and the frequency of its update are scaled to keep the processor time spent deriving the value within budget.

Visibility is defined as the lighting, movement, and exposure (size, separation from other objects) of the entity. The meaning of these is closely tied to the game requirements. For example, the lighting of the player is biased towards the lighting near the floor below the player, as this provides the player with an objective, perceivable way to anticipate their own safety. These values and their aggregate sum, the visibility, are stored as analog values in the range 0..1.
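As a rough sketch of how such a value might be assembled, the code below keeps the three components as 0..1 values and combines them into a single 0..1 visibility. The text does not give the actual formula, so the weighted sum, the weights, the floor_bias parameter, and every name here are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass

def clamp01(value):
    """Clamp a number into the 0..1 analog range."""
    return max(0.0, min(1.0, value))

def player_lighting(floor_light, body_light, floor_bias=0.75):
    # Assumed blend: weight the light near the floor more heavily than the
    # light on the body, so the player can judge safety from the ground.
    return clamp01(floor_bias * floor_light + (1.0 - floor_bias) * body_light)

@dataclass
class Visibility:
    lighting: float
    movement: float
    exposure: float

    def aggregate(self, weights=(0.5, 0.25, 0.25)):
        # Hypothetical weights; a weighted sum of clamped components,
        # clamped again so the result stays in 0..1.
        w_light, w_move, w_expo = weights
        total = (w_light * clamp01(self.lighting)
                 + w_move * clamp01(self.movement)
                 + w_expo * clamp01(self.exposure))
        return clamp01(total)
```

Keeping every component in the same 0..1 range makes the weights the only tuning knobs, which fits the goal of a system that designers can adjust and reason about.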