BEAMER: Behavioral Encoder to Generate Multiple Appropriate Facial Reactions

Ximi Hoque, Adamay Mann, Gulshan Sharma, Abhinav Dhall

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Citations (Scopus)

Abstract

This paper presents a framework for generating appropriate facial expressions for a listener engaged in a dyadic conversation. The ability to produce contextually suitable facial gestures in response to user interactions may enhance the user experience in interactions with avatars and social robots. We propose a Transformer and Siamese architecture-based approach for generating appropriate facial expressions. Positive and negative Speaker-Listener pairs are created, and a contrastive loss is applied to them to facilitate learning. Furthermore, an ensemble of reconstruction-quality-sensitive loss functions is added to the network for learning discriminative features. The listener's facial reactions are represented with a combination of the 3D Morphable Model's coefficients and affect-related attributes (facial action units). The inputs to the network are pre-trained Transformer-based MARLIN features and affect-related features. Experimental analysis demonstrates the effectiveness of the proposed method across various metrics, in the form of an increase in performance compared to a variational auto-encoder-based baseline.
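The abstract does not state the exact form of the contrastive loss, so the following is only a minimal sketch of one common choice: a margin-based pairwise contrastive loss over Speaker-Listener embedding pairs. The function name `contrastive_loss`, the `margin` parameter, the Euclidean distance, and the embedding shapes are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def contrastive_loss(speaker_emb, listener_emb, is_positive, margin=1.0):
    """Margin-based pairwise contrastive loss (illustrative, not the paper's exact loss).

    speaker_emb, listener_emb: arrays of shape (batch, dim), hypothetical encoder outputs.
    is_positive: array of shape (batch,), 1 for a matching Speaker-Listener pair, 0 otherwise.
    """
    speaker_emb = np.asarray(speaker_emb, dtype=float)
    listener_emb = np.asarray(listener_emb, dtype=float)
    is_positive = np.asarray(is_positive, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(speaker_emb - listener_emb, axis=1)

    # Positive pairs are pulled together: penalise squared distance.
    pos_term = is_positive * dist ** 2
    # Negative pairs are pushed apart until they are at least `margin` away.
    neg_term = (1.0 - is_positive) * np.maximum(margin - dist, 0.0) ** 2

    return float(np.mean(pos_term + neg_term))


# Two toy pairs: the first has identical embeddings, the second is 5 units apart.
speaker = [[0.0, 0.0], [0.0, 0.0]]
listener = [[0.0, 0.0], [3.0, 4.0]]
loss_well_separated = contrastive_loss(speaker, listener, is_positive=[1, 0])
loss_mislabelled = contrastive_loss(speaker, listener, is_positive=[0, 1])
```

With these toy values, labelling the identical pair as positive and the distant pair as negative gives a lower loss than the reverse labelling, which is the behaviour a contrastive objective is meant to encourage.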

Original language: English
Title of host publication: MM '23 - Proceedings of the 31st ACM International Conference on Multimedia
Place of Publication: New York, NY
Publisher: Association for Computing Machinery, Inc
Pages: 9536-9540
Number of pages: 5
ISBN (Electronic): 9798400701085
Publication status: Published - 27 Oct 2023
Externally published: Yes
Event: 31st ACM International Conference on Multimedia - Ottawa, Canada
Duration: 29 Oct 2023 to 3 Nov 2023
Conference number: 31st

Publication series

Name: Proceedings of the ACM International Conference on Multimedia
Publisher: Association for Computing Machinery
Volume: 2023

Conference

Conference: 31st ACM International Conference on Multimedia
Abbreviated title: MM 2023
Country/Territory: Canada
City: Ottawa
Period: 29/10/23 to 3/11/23

Keywords

  • behavioral encoder
  • contrastive learning
  • dyadic interactions
  • facial reactions generation
  • transformer
