The consistency and stability of acoustic and visual cues for different prosodic attitudes

Jeesun Kim, Chris Wayne Davis

Research output: Contribution to conference › Paper › peer-review

1 Citation (Scopus)

Abstract

Recently it has been argued that speakers use conventionalized forms to express different prosodic attitudes [1]. We examined this by looking at across-speaker consistency in the expression of auditory and visual (head and face motion) prosodic attitudes produced on multiple occasions. Specifically, we examined the acoustic and motion profiles of a female and a male speaker expressing six different prosodic attitudes, with four within-session repetitions in each of four different sessions. We used the same acoustic features as [1], and visual prosody was assessed by examining patterns of each speaker's mouth, eyebrow and head movements. There was considerable variation in how prosody was realized across speakers, with the productions of one speaker more discriminable than those of the other. Within-session variation for both the acoustic and movement data was smaller than across-session variation, suggesting that short-term memory plays a role in consistency. The expression of some attitudes was less variable than that of others, and discrimination was better with the acoustic than with the visual data, although certain visual features (e.g., eyebrow motion) provided better discrimination than others.

Original language: English
Pages: 57-61
Number of pages: 5
Publication status: Published - 2016
Event: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Duration: 8 Sep 2016 → …

Conference

Conference: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Period: 8/09/16 → …
