A Multilevel Fusion Approach for Audiovisual Emotion Recognition

Girija Chetty, Michael Wagner

Research output: A Conference proceeding or a Chapter in Book > Conference contribution > peer-review

6 Citations (Scopus)


Human-computer interaction will be more natural if
computers are able to perceive and respond to human nonverbal communication such as emotions. Although several
approaches have been proposed to recognize human emotions
based on facial expressions or speech, relatively little work
has been done to fuse these two modalities to improve the accuracy and
robustness of the emotion recognition system. This paper
analyzes the strengths and limitations of systems based
only on facial expressions or only on acoustic information. It also
analyzes two approaches used to fuse these two modalities,
decision-level and feature-level integration, and proposes a
new multilevel fusion approach for enhancing person-dependent
and person-independent classification performance
for different emotions. Two audiovisual emotion
data corpora were used to evaluate the proposed fusion
approach, DaFEx [1,2] and ENTERFACE [3], comprising
audiovisual emotion data from several actors expressing six
different emotions: anger, disgust, fear, happiness, sadness
and surprise. The results of the experimental study reveal that
the system based on fusion of facial expressions with acoustic
information yields better performance than the system based
on just acoustic information or facial expressions, for the
emotions considered. Results also show an improvement in
classification performance for different emotions with the
multilevel fusion approach as compared to either feature-level
or score-level fusion.
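
To make the distinction between the fusion levels concrete, the following is a minimal Python sketch. It is an illustration only: the abstract does not specify the authors' features, classifiers, weights, or how their multilevel scheme combines the levels. The function names, the equal weights, and the toy score values are all hypothetical. Here feature-level fusion is shown as concatenating per-modality feature vectors, score-level (decision-level) fusion as a weighted average of per-class scores, and "multilevel" fusion is assumed, for illustration, to mean averaging the audio-only, visual-only, and joint-feature classifier scores.

```python
# Illustrative sketch only; not the method from the paper.
# The six emotion labels are the ones listed in the abstract.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def feature_level_fusion(audio_features, visual_features):
    """Concatenate per-modality feature vectors into one joint vector."""
    return list(audio_features) + list(visual_features)


def score_level_fusion(score_sets, weights):
    """Weighted average of per-class scores from several classifiers."""
    if len(score_sets) != len(weights):
        raise ValueError("need exactly one weight per score set")
    total = float(sum(weights))
    return {
        emotion: sum(w * s[emotion] for s, w in zip(score_sets, weights)) / total
        for emotion in EMOTIONS
    }


def predict(scores):
    """Return the emotion label with the highest fused score."""
    return max(scores, key=scores.get)


# Hypothetical per-class scores from three separate classifiers.
audio_scores = {"anger": 0.5, "disgust": 0.1, "fear": 0.1,
                "happiness": 0.1, "sadness": 0.1, "surprise": 0.1}
visual_scores = {"anger": 0.2, "disgust": 0.1, "fear": 0.1,
                 "happiness": 0.4, "sadness": 0.1, "surprise": 0.1}
# Assumed to come from a classifier trained on concatenated features.
joint_scores = {"anger": 0.3, "disgust": 0.1, "fear": 0.1,
                "happiness": 0.3, "sadness": 0.1, "surprise": 0.1}

# Feature level: one joint vector that a single classifier would consume.
joint_vector = feature_level_fusion([0.2, 0.7], [0.9])

# Score level: combine the two unimodal classifiers' outputs.
score_level = score_level_fusion([audio_scores, visual_scores], [1.0, 1.0])

# Assumed multilevel variant: also include the joint-feature classifier.
multilevel = score_level_fusion(
    [audio_scores, visual_scores, joint_scores], [1.0, 1.0, 1.0]
)

print(predict(score_level), predict(multilevel))
```

In practice the weights would be tuned on held-out data, and the per-class scores would come from trained classifiers rather than fixed dictionaries.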
Original language: English
Title of host publication: Proceedings of Audiovisual Speech Processing 2008
Editors: Amit Konar, Aruna Chakraborty
Place of Publication: Adelaide
Number of pages: 6
Publication status: Published - 2008
Event: Audiovisual Speech Processing 2008 - Moreton Island, Australia
Duration: 26 Sept 2008 to 29 Sept 2008


Conference: Audiovisual Speech Processing 2008
City: Moreton Island

