TY - GEN
T1 - CNN Depression Severity Level Estimation from Upper Body vs. Face-Only Images
AU - Ahmad, Dua’a
AU - Goecke, Roland
AU - Ireland, James
N1 - Funding Information:
This research was supported partially by the Australian Government through the Australian Research Council’s Discovery Projects funding scheme (project DP190101294).
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/2/25
Y1 - 2021/2/25
AB - Upper body gestures have proven to provide more information about a person’s depressive state when added to facial expressions. While several studies on automatic depression analysis have looked into this impact, little is known in regard to how a convolutional neural network (CNN) uses such information for predicting depression severity levels. This study investigates the performance in various CNN models when looking at facial images alone versus including the upper body when estimating depression severity levels on a regressive scale. To assess generalisability of CNN model performance, two vastly different datasets were used, one collected by the Black Dog Institute and the other being the 2013 Audio/Visual Emotion Challenge (AVEC). Results show that the differences in model performance between face versus upper body are slight, as model performance across multiple architectures is very similar but varies when different datasets are introduced.
KW - Convolutional neural networks
KW - Depression severity
KW - Spatial analysis
UR - http://www.scopus.com/inward/record.url?scp=85103294569&partnerID=8YFLogxK
UR - https://iapr.org/archives/mprss2020/index.html
U2 - 10.1007/978-3-030-68780-9_56
DO - 10.1007/978-3-030-68780-9_56
M3 - Conference contribution
AN - SCOPUS:85103294569
SN - 9783030687793
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 744
EP - 758
BT - Pattern Recognition. ICPR International Workshops and Challenges, 2021, Proceedings
A2 - Del Bimbo, Alberto
A2 - Cucchiara, Rita
A2 - Sclaroff, Stan
A2 - Farinella, Giovanni Maria
A2 - Mei, Tao
A2 - Bertini, Marco
A2 - Escalante, Hugo Jair
A2 - Vezzani, Roberto
PB - Springer
CY - Netherlands
T2 - 25th International Conference on Pattern Recognition Workshops, ICPR 2020
Y2 - 10 January 2021 through 15 January 2021
ER -