Perception and Interpretation

Visual scene interpretation is an important ability for intelligent systems. We realize this ability as a complete loop: extracting bottom-up information from visual data, fusing this information with top-down knowledge, and improving bottom-up processing by feeding interpretations back as relevance or context information.
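Such a closed loop can be sketched roughly as follows. All function names, the toy feature extractor, and the relevance-update rule are illustrative assumptions, not the actual system:

```python
# Minimal sketch of a bottom-up / top-down interpretation loop (assumed design).

def extract_features(image):
    # Bottom-up step: derive simple evidence from raw data
    # (here just the mean intensity of a flat pixel list).
    return {"brightness": sum(image) / len(image)}

def interpret(features, context):
    # Fusion step: combine bottom-up evidence with top-down
    # knowledge, here modeled as a relevance prior in the context.
    score = features["brightness"] * context.get("relevance", 1.0)
    return {"label": "bright" if score > 0.5 else "dark", "score": score}

def update_context(context, interpretation):
    # Feedback step: the interpretation becomes relevance/context
    # information for the next bottom-up pass.
    updated = dict(context)
    updated["relevance"] = (
        0.5 * context.get("relevance", 1.0) + 0.5 * interpretation["score"]
    )
    return updated

def interpretation_loop(image, iterations=3):
    # Run the extract -> fuse -> feed-back cycle a fixed number of times.
    context = {}
    interpretation = None
    for _ in range(iterations):
        features = extract_features(image)
        interpretation = interpret(features, context)
        context = update_context(context, interpretation)
    return interpretation, context
```

In a real system the feature extractor and the fusion step would be learned components; the point of the sketch is only the cycle in which interpretations flow back as context.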

The interpretation of facial expressions, head gestures, and prosodic information provides important non-verbal cues for intelligent systems. These cues enable such a system to gain information about the user's mental state and the quality of an interaction.

Multi-modal perception of an agent's environment is crucial for interacting naturally with human partners in non-laboratory settings. This applies to robots as well as to smart homes and interactive virtual agents. We perceive activities multi-modally at both a global scene level and a local, human-centered level.
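One simple way to combine evidence from several modalities is late fusion, where each modality scores the candidate activities independently and the scores are averaged. This is a generic sketch under that assumption, not the group's actual pipeline:

```python
# Late fusion across modalities (illustrative sketch): each modality
# contributes a dict of per-activity scores; we average and take the argmax.

def fuse_activities(modality_scores):
    fused = {}
    for scores in modality_scores:
        for activity, score in scores.items():
            fused[activity] = fused.get(activity, 0.0) + score / len(modality_scores)
    # Return the activity with the highest averaged score.
    return max(fused, key=fused.get)

# Hypothetical per-modality outputs, e.g. from a vision and an audio detector.
vision_scores = {"waving": 0.8, "idle": 0.2}
audio_scores = {"waving": 0.6, "idle": 0.4}
activity = fuse_activities([vision_scores, audio_scores])
```

Weighted averaging or learned fusion would replace the uniform mean in practice; the uniform version keeps the sketch self-contained.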