TY - JOUR
T1 - Multi-modal Framework for Analyzing the Affect of a Group of People
AU - Huang, Xiaohua
AU - Dhall, Abhinav
AU - Goecke, Roland
AU - Pietikäinen, Matti
AU - Zhao, Guoying
PY - 2018/10
Y1 - 2018/10
N2 - With the advances in multimedia and the World Wide Web, users upload millions of images and videos every day on social networking platforms on the Internet. From the perspective of automatic human behavior understanding, it is of interest to analyze and model the affect exhibited by groups of people participating in social events in these images. However, analyzing the affect expressed by multiple people is challenging due to the varied indoor and outdoor settings. Recently, a few interesting works have investigated face-based Group-level Emotion Recognition (GER). In this paper, we propose a multi-modal framework for enhancing the affect analysis ability of GER in challenging environments. Specifically, to encode a person's information in a group-level image, we first propose an information aggregation method for generating feature descriptions of the face, upper body and scene. We then revisit localized multiple kernel learning for fusing the face, upper body and scene information for GER in challenging environments. Extensive experiments are performed on two challenging group-level emotion databases (HAPPEI and GAFF) to investigate the roles of the face, upper body and scene information, and of the multi-modal framework. Experimental results demonstrate that the multi-modal framework achieves promising performance for GER.
AB - With the advances in multimedia and the World Wide Web, users upload millions of images and videos every day on social networking platforms on the Internet. From the perspective of automatic human behavior understanding, it is of interest to analyze and model the affect exhibited by groups of people participating in social events in these images. However, analyzing the affect expressed by multiple people is challenging due to the varied indoor and outdoor settings. Recently, a few interesting works have investigated face-based Group-level Emotion Recognition (GER). In this paper, we propose a multi-modal framework for enhancing the affect analysis ability of GER in challenging environments. Specifically, to encode a person's information in a group-level image, we first propose an information aggregation method for generating feature descriptions of the face, upper body and scene. We then revisit localized multiple kernel learning for fusing the face, upper body and scene information for GER in challenging environments. Extensive experiments are performed on two challenging group-level emotion databases (HAPPEI and GAFF) to investigate the roles of the face, upper body and scene information, and of the multi-modal framework. Experimental results demonstrate that the multi-modal framework achieves promising performance for GER.
KW - Emotion recognition
KW - Face
KW - Facial expression recognition
KW - Facial features
KW - Group-level emotion recognition
KW - Multi-modality
KW - Group affect
UR - http://www.scopus.com/inward/record.url?scp=85044348664&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/c06cbfb0-fcd7-3fb9-81c7-48083d301b07/
U2 - 10.1109/TMM.2018.2818015
DO - 10.1109/TMM.2018.2818015
M3 - Article
AN - SCOPUS:85044348664
SN - 1520-9210
VL - 20
SP - 2706
EP - 2721
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 10
M1 - 8323249
ER -