A Multilevel Fusion Approach for Audiovisual Emotion Recognition

Girija Chetty, Michael Wagner

Research output: Conference contribution (a conference proceeding or a chapter in a book)

Abstract

Human-computer interaction will be more natural if computers are able to perceive and respond to non-verbal human communication such as emotions. Although several approaches have been proposed to recognise human emotions from facial expressions or from speech, relatively little work has been done on fusing the two modalities to improve the accuracy and robustness of emotion recognition systems. This paper analyses the strengths and limitations of systems based only on facial expressions or only on acoustic information. It also analyses two approaches for fusing the two modalities, decision-level and feature-level integration, and proposes a new multilevel fusion approach for enhancing person-dependent and person-independent classification performance for different emotions. Two audiovisual emotion data corpora were used to evaluate the proposed fusion approach, DaFEx [1,2] and ENTERFACE [3], comprising audiovisual emotion data from several actors eliciting six different emotions: anger, disgust, fear, happiness, sadness and surprise. The results of the experimental study reveal that, for the emotions considered, a system based on the fusion of facial expressions with acoustic information yields better performance than a system based on acoustic information or facial expressions alone. The results also show an improvement in the classification performance for the different emotions with the multilevel fusion approach compared with either feature-level or score-level fusion.
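The abstract contrasts feature-level integration (combine modality features, then classify) with score-level (decision-level) integration (classify each modality, then combine scores). As a minimal illustrative sketch only, not the authors' implementation (the classifiers, feature vectors and fusion weight below are placeholders), the two schemes differ in where the modalities meet:

```python
import math
import random

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def softmax(scores):
    """Normalise raw scores into a probability distribution over emotions."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Stand-in per-modality classifiers: any model mapping a feature vector
# to one score per emotion would fit here (these produce placeholder scores).
def audio_classifier(features):
    random.seed(round(sum(features), 6))
    return softmax([random.random() for _ in EMOTIONS])

def video_classifier(features):
    random.seed(round(2 * sum(features), 6))
    return softmax([random.random() for _ in EMOTIONS])

def joint_classifier(features):
    random.seed(round(3 * sum(features), 6))
    return softmax([random.random() for _ in EMOTIONS])

audio_feat = [0.1, 0.5, 0.2]        # e.g. prosodic/spectral features (hypothetical)
video_feat = [0.7, 0.3, 0.9, 0.4]   # e.g. facial-expression features (hypothetical)

# Feature-level fusion: concatenate the modality features, classify once.
feature_level = joint_classifier(audio_feat + video_feat)

# Score-level (decision-level) fusion: classify each modality separately,
# then combine the per-emotion scores, here with a weighted sum.
w_audio = 0.4
score_level = [w_audio * a + (1 - w_audio) * v
               for a, v in zip(audio_classifier(audio_feat),
                               video_classifier(video_feat))]

print("feature-level pick:", EMOTIONS[feature_level.index(max(feature_level))])
print("score-level pick:  ", EMOTIONS[score_level.index(max(score_level))])
```

A multilevel scheme, as proposed in the paper, would combine evidence at both of these stages rather than committing to one.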
Original language: English
Title of host publication: Proceedings of Audiovisual Speech Processing 2008
Editors: Amit Konar, Aruna Chakraborty
Place of publication: Adelaide
Publisher: AVISA
Pages: 115-120
Number of pages: 6
Volume: 2008
Publication status: Published - 2008
Event: Audiovisual Speech Processing 2008 - Moreton Island, Australia
Duration: 26 Sep 2008 - 29 Sep 2008

Conference

Conference: Audiovisual Speech Processing 2008
Country: Australia
City: Moreton Island
Period: 26/09/08 - 29/09/08


Cite this

Chetty, G., & Wagner, M. (2008). A Multilevel Fusion Approach for Audiovisual Emotion Recognition. In A. Konar, & A. Chakraborty (Eds.), Proceedings of Audiovisual Speech Processing 2008 (Vol. 2008, pp. 115-120). Adelaide: AVISA.
@inproceedings{e44cde3c87e0400c9170f20952cd916b,
  title = "A Multilevel Fusion Approach for Audiovisual Emotion Recognition",
  abstract = "Human-computer interaction will be more natural if computers are able to perceive and respond to non-verbal human communication such as emotions. Although several approaches have been proposed to recognise human emotions from facial expressions or from speech, relatively little work has been done on fusing the two modalities to improve the accuracy and robustness of emotion recognition systems. This paper analyses the strengths and limitations of systems based only on facial expressions or only on acoustic information. It also analyses two approaches for fusing the two modalities, decision-level and feature-level integration, and proposes a new multilevel fusion approach for enhancing person-dependent and person-independent classification performance for different emotions. Two audiovisual emotion data corpora were used to evaluate the proposed fusion approach, DaFEx [1,2] and ENTERFACE [3], comprising audiovisual emotion data from several actors eliciting six different emotions: anger, disgust, fear, happiness, sadness and surprise. The results of the experimental study reveal that, for the emotions considered, a system based on the fusion of facial expressions with acoustic information yields better performance than a system based on acoustic information or facial expressions alone. The results also show an improvement in the classification performance for the different emotions with the multilevel fusion approach compared with either feature-level or score-level fusion.",
  author = "Girija Chetty and Michael Wagner",
  year = "2008",
  language = "English",
  volume = "2008",
  pages = "115--120",
  editor = "Amit Konar and Aruna Chakraborty",
  booktitle = "Proceedings of Audiovisual Speech Processing 2008",
  publisher = "AVISA",
  address = "Adelaide",
}


