Abstract
This highly inter-disciplinary PhD project addresses the problem of multimodal affective sensing with
a focus on developing objective measures for depression analysis using multimodal cues, such as facial
expressions, vocal expressions, head movements and heart rate variability. As the depression severity of
a subject increases, the facial movements become very subtle. In order to quantify depression and its
subtypes, these changes need to be revealed. A particular focus of this research is to improve the ability
of affective computing approaches to sense the subtle expressions of affect in face, voice and head pose,
and to design and implement approaches to analyse depression severity.
Depression and other mood disorders are common, disabling disorders with a profound impact on
individuals and families. The landmark WHO 1990 Global Burden of Disease (GBD) report and WHO
2004 GBD Update quantified depression as the leading cause of disability worldwide and projected it
to be the second-leading cause of disease burden by 2020. By 2030 depression is expected to be the
largest single healthcare burden, costing US$6 trillion globally. According to the 2007 National
Survey of Mental Health and Wellbeing (SMHWB) by the Australian Bureau of Statistics (ABS), of the
16 million Australians aged 16-85 years, almost half (45% or 7.3 million) had experienced a mental
health disorder at some point in their lifetime. One in five (20% or 3.2 million) had experienced a
mental health disorder in the preceding 12 months. Despite the high prevalence, current clinical practice depends almost exclusively
on self-report and clinical opinion, risking a range of subjective biases. There are currently no objective
laboratory-based measures of the course and recovery for depression, and no objective markers for
therapy in clinical settings. This compromises optimal patient care, increasing the burden of disability.
The research presented in this thesis has addressed some of the challenges in affective computing
specific to depression analysis: (i) investigated and improved the sensitivity and specificity of affective
computing approaches by multimodal fusion of audio and video cues and demonstrated that these methods
can successfully distinguish subtypes of depression, (ii) demonstrated that non-invasive estimation
of heart rate from facial videos can be used as a modality for depression analysis, and (iii) investigated
interpersonal coordination of head movement between patients and therapists in dyadic depression severity
interviews, results of which indicate a strong effect for patient-therapist head movement coordination.
The results also demonstrate that interpersonal coordination of head movement varies with change in depression severity.
The investigations have been exemplified using two affective sensing datasets, (i) the
Black Dog Institute dataset and (ii) the University of Pittsburgh dataset, around the benchmark problem
of quantifying depression and melancholia. These results will assist future developments towards more
fine-grained depression severity estimation and analysis.
Date of Award | 2021 |
---|---|
Original language | English |
Supervisor | Roland Goecke (Supervisor), Michael Wagner (Supervisor) & Munawar Hayat (Supervisor) |