During public addresses, speakers accompany their discourse with spontaneous hand gestures (beats) that are tightly synchronized with the prosodic contour of the discourse. It has been proposed that speech and beat gestures originate from a common underlying linguistic process whereby both speech prosody and beats serve to emphasize relevant information. We hypothesized that breaking the consistency between beats and prosody by temporal desynchronization, would modulate activity of brain areas sensitive to speech–gesture integration. To this aim, we measured BOLD responses as participants watched a natural discourse where the speaker used beat gestures.

In order to identify brain areas specifically involved in processing hand gestures with communicative intention, beat synchrony was evaluated against arbitrary visual cues bearing equivalent rhythmic and spatial properties as the gestures. Our results revealed that left MTG and IFG were specifically sensitive to speech synchronized with beats, compared to the arbitrary vision–speech pairing. Our results suggest that listeners confer beats a function of visual prosody, complementary to the prosodic structure of speech. We conclude that the emphasizing function of beat gestures in speech perception is instantiated through a specialized brain network sensitive to the communicative intent conveyed by a speaker with his/her hands.