Humans optimize behavior by deriving context-based expectations. Contextual data that are important for survival are extracted rapidly, using coarse information, adaptive decision strategies, and dedicated neural infrastructure. In the field of object perception, the influence of surrounding context has been a major research theme and has generated a large literature. That visual context, as typically provided by natural scenes, facilitates object recognition has been convincingly demonstrated (Bar, M. (2004) Nat. Rev. Neurosci., 5: 617-629). Just like objects, faces are generally encountered as part of a natural scene. Thus far, the facial expression literature has neglected such context and has treated facial expressions as if they stand on their own. This constitutes a major gap in our knowledge. Facial expressions tend to appear in a context of head and body orientations, body movements, posture changes, and other object-related actions with a similar or at least a closely related meaning. For instance, one would expect a frightened face, when confronted with an external danger, to be accompanied at least by withdrawal movements of the head and shoulders. Furthermore, some cues provided by the environment or the context in which a facial expression appears may have a direct relation to the emotion displayed by the face. The brain may even fill in the natural scene context typically associated with the facial expression. Recognition of the facial expression may also profit from processing the vocal emotion and the emotional body language that normally accompany it. Here we review the emerging evidence on how the immediate visual and auditory contexts influence the recognition of facial expressions.