Multimodal approach for automatic depression analysis

  • Jyoti J. Dhall

Student thesis: Doctoral Thesis

Abstract

Depression is one of the most common and disabling mental disorders, and has a major impact on society. The landmark WHO 2004 Global Burden of Disease report quantified depression as the leading cause of disability worldwide (an estimated 154 million sufferers). Fortunately, the burden of depression can be reduced, provided suitable objective technology for detecting it is available. Disturbances in the expression of affect reflect changes in mood and interpersonal style, and are arguably a key index of a current depressive episode. These disturbances lead directly to impaired interpersonal functioning, causing a range of disabilities: reduced functioning in the workforce, absenteeism, and difficulties with everyday tasks (such as shopping). Whilst these are a constant source of distress for affected subjects, the economic impact of mental health disorders, through both direct and indirect costs, has long been underestimated. Despite its severity and high prevalence, there currently exist no laboratory-based objective measures of illness expression, course and recovery. This compromises optimal patient care, compounding the burden of disability. As healthcare costs increase worldwide, the provision of effective health monitoring systems and diagnostic aids is highly important. With the advancement of affective sensing and machine learning, computer-aided diagnosis can and will play a major role in providing an objective assessment. The research presented in this thesis addresses the following key issues of automatic depression analysis: 1) analysing geometrical and appearance descriptors for depression analysis; 2) the role of upper body movements in detecting depression; 3) fusion of audio and video channels; 4) relative body part movement analysis for depression detection. The central hypothesis of the thesis is that combining different modalities with information from body part movements will lead to more accurate detection of depression.
To validate the approaches, clinically approved datasets from the Black Dog Institute, Sydney, and the University of Pittsburgh, USA, are used. First, subject-dependent, Active Appearance Model based geometrical descriptors are computed. Subject-independent, parts-based models are applied in parallel to extract texture descriptors. A thorough comparison is made on the Black Dog Institute clinical data. Furthermore, head movements are computed from the fiducial points and a histogram is constructed. Space-Time Interest Points are also computed on the upper body to capture subtle gestures, which can provide discriminative information. The speech signal is also analysed and its feature descriptors are combined with the visual information extracted from the face and upper body, respectively. Various fusion scenarios are studied in this multimodal framework. The contribution of body expressions is further explored by proposing a relative part movement framework and validating it on the University of Pittsburgh data. To the best of my knowledge, this is the first work in the affective computing community to use body expressions for detecting depression. The results presented in this thesis show that, as hypothesised, the multimodal framework outperforms uni-modal approaches in the task of classifying depressed patients versus healthy controls. Moreover, body expressions, used as an auxiliary modality, provide significant discriminating information for depression recognition.
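The pipeline described in the abstract can be illustrated with a minimal sketch: frame-to-frame displacements of fiducial (landmark) points are binned into a head-movement histogram, the resulting visual descriptor is concatenated with an audio descriptor for feature-level fusion, and a classifier separates the two groups. All data here are synthetic stand-ins and the function names, descriptor sizes, and the SVM classifier are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def head_movement_histogram(landmarks, n_bins=8):
    """Histogram of frame-to-frame fiducial-point displacement magnitudes.

    landmarks: (n_frames, n_points, 2) array of point coordinates.
    Returns a normalised n_bins-dimensional movement descriptor.
    """
    disp = np.diff(landmarks, axis=0)           # per-frame point displacements
    mag = np.linalg.norm(disp, axis=2).ravel()  # movement magnitude per point
    hist, _ = np.histogram(mag, bins=n_bins, range=(0.0, mag.max() + 1e-8))
    return hist / hist.sum()

def fuse_features(visual_desc, audio_desc):
    """Feature-level fusion: concatenate the per-modality descriptors."""
    return np.concatenate([visual_desc, audio_desc])

# --- toy usage with synthetic data (illustrative only) ---
rng = np.random.default_rng(0)
X, y = [], []
for label in (0, 1):                            # 0 = control, 1 = depressed (toy labels)
    for _ in range(20):
        jitter = 0.5 + 2.0 * label              # pretend the groups move differently
        lm = np.cumsum(rng.normal(0.0, jitter, (50, 68, 2)), axis=0)
        visual = head_movement_histogram(lm)    # 8-dim movement histogram
        audio = rng.normal(label, 1.0, 12)      # stand-in 12-dim audio descriptor
        X.append(fuse_features(visual, audio))
        y.append(label)
X, y = np.asarray(X), np.asarray(y)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print(X.shape)  # (40, 20): 8 histogram bins + 12 audio dimensions per sample
```

Decision-level fusion, also studied in the thesis, would instead train one classifier per modality and combine their outputs (e.g. by voting or score averaging) rather than concatenating descriptors.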
Date of Award: 2016
Original language: English
Awarding Institution
  • University of Canberra
Supervisors: Roland Goecke, Elisa Martinez-Marroquin, Michael Wagner & Tom Gedeon
