Perceiving emotion from a talker: How face and voice work together

Jeesun Kim, Chris Wayne Davis

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)


The experiment investigated how the addition of emotion information from the voice affects the identification of facial emotion. We presented whole face, upper face, and lower face displays and examined correct recognition rates and patterns of response confusions for auditory-visual (AV), auditory-only (AO), and visual-only (VO) expressive speech. Emotion recognition accuracy was superior for AV compared to unimodal presentation. The pattern of response confusions differed across the unimodal conditions and across display type. For AV presentation, a response confusion occurred only when that confusion was present in each modality separately; response confusions were thus reduced compared to unimodal presentations. Emotion space (calculated from the confusion data) differed across display types for the VO presentations but was more similar for the AV ones, indicating that the addition of auditory information acted to harmonize the various VO response patterns. These results are discussed with respect to how bimodal emotion recognition combines auditory and visual information.

Original language: English
Pages (from-to): 902-921
Number of pages: 20
Journal: Visual Cognition
Issue number: 8
Publication status: Published - Sept 2012


  • Bimodal processing
  • Emotion perception
  • Face and voice expression
  • Perceptual confusions

