Multimodal assistive technologies for depression diagnosis and monitoring

Jyoti Joshi, Roland Goecke, Sharifa Alghowinem, Abhinav Dhall, Michael Wagner, Julien Epps, Gordon Parker, Michael Breakspear

    Research output: Contribution to journal › Article

    53 Citations (Scopus)
    2 Downloads (Pure)

    Abstract

    Depression is a severe mental health disorder with high societal costs. Current clinical practice depends almost exclusively on self-report and clinical opinion, risking a range of subjective biases. The long-term goal of our research is to develop assistive technologies to support clinicians and sufferers in the diagnosis and monitoring of treatment progress in a timely and easily accessible format. In the first phase, we aim to develop a diagnostic aid using affective sensing approaches. This paper describes the progress to date and proposes a novel multimodal framework comprising audio-video fusion for depression diagnosis. We exploit the proposition, well known in auditory-visual speech processing, that the auditory and visual channels of human communication complement each other, and investigate this hypothesis for depression analysis. For the video data analysis, intra-facial muscle movements and the movements of the head and shoulders are analysed by computing spatio-temporal interest points. In addition, various audio features (fundamental frequency f0, loudness, intensity and mel-frequency cepstral coefficients) are computed. Next, a bag of visual features and a bag of audio features are generated separately. In this study, we compare fusion methods at feature level, score level and decision level. Experiments are performed on an age- and gender-matched clinical dataset of 30 patients and 30 healthy controls. The results from the multimodal experiments show the proposed framework’s effectiveness in depression analysis.
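
    To make the pipeline concrete, here is a minimal sketch of the bag-of-features representation and the three fusion strategies the abstract compares. It is not the authors' implementation: the descriptor dimensions, the 32-word vocabularies, the SVM classifier and the equal score weights are all illustrative assumptions standing in for the paper's spatio-temporal interest point and audio features.

    # Hedged sketch: bag of audio/visual features plus feature-, score- and
    # decision-level fusion. All data below is synthetic; real inputs would be
    # STIP descriptors per video clip and frame-wise audio features (f0,
    # loudness, intensity, MFCCs).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    def bag_of_features(descriptors_per_clip, vocab):
        """Quantise each clip's local descriptors against a learned vocabulary
        and return one L1-normalised histogram (a 'bag of features') per clip."""
        hists = []
        for desc in descriptors_per_clip:
            words = vocab.predict(desc)  # nearest audio/visual "word" per descriptor
            h = np.bincount(words, minlength=vocab.n_clusters).astype(float)
            hists.append(h / max(h.sum(), 1.0))
        return np.vstack(hists)

    # Toy stand-ins: 30 patients + 30 healthy controls, random descriptors per clip.
    n_clips = 60
    labels = np.array([0] * 30 + [1] * 30)
    video_desc = [rng.normal(size=(100, 72)) for _ in range(n_clips)]  # STIP-like
    audio_desc = [rng.normal(size=(200, 13)) for _ in range(n_clips)]  # MFCC-like

    # Learn one vocabulary per modality, then encode each clip as a histogram.
    video_vocab = KMeans(n_clusters=32, n_init=3, random_state=0).fit(np.vstack(video_desc))
    audio_vocab = KMeans(n_clusters=32, n_init=3, random_state=0).fit(np.vstack(audio_desc))
    Xv = bag_of_features(video_desc, video_vocab)
    Xa = bag_of_features(audio_desc, audio_vocab)

    # 1) Feature-level fusion: concatenate both bags, train a single classifier.
    clf_feat = SVC(probability=True).fit(np.hstack([Xv, Xa]), labels)

    # 2) Score-level fusion: one classifier per modality, combine soft scores
    #    (equal weights are an assumption; they could be tuned on held-out data).
    clf_v = SVC(probability=True).fit(Xv, labels)
    clf_a = SVC(probability=True).fit(Xa, labels)
    fused_score = 0.5 * clf_v.predict_proba(Xv)[:, 1] + 0.5 * clf_a.predict_proba(Xa)[:, 1]

    # 3) Decision-level fusion: combine hard decisions; with only two modalities
    #    an OR/AND rule stands in for a majority vote.
    fused_decision = (clf_v.predict(Xv) + clf_a.predict(Xa)) >= 1  # OR rule

    In practice the three strategies trade off differently: feature-level fusion lets the classifier model cross-modal interactions, while score- and decision-level fusion keep the modalities independent and degrade more gracefully when one channel is noisy or missing.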
    Original language: English
    Pages (from-to): 217-228
    Number of pages: 12
    Journal: Journal on Multimodal User Interfaces
    Volume: 7
    Issue number: 3
    DOIs: https://doi.org/10.1007/s12193-013-0123-2
    Publication status: Published - 2013

    Cite this

    Joshi, Jyoti ; Goecke, Roland ; Alghowinem, Sharifa ; Dhall, Abhinav ; Wagner, Michael ; Epps, Julien ; Parker, Gordon ; Breakspear, Michael. / Multimodal assistive technologies for depression diagnosis and monitoring. In: Journal on Multimodal User Interfaces. 2013 ; Vol. 7, No. 3. pp. 217-228.
    @article{eb28015f18ea4904b6d58f3941d0710d,
    title = "Multimodal assistive technologies for depression diagnosis and monitoring",
    abstract = "Depression is a severe mental health disorder with high societal costs. Current clinical practice depends almost exclusively on self-report and clinical opinion, risking a range of subjective biases. The long-term goal of our research is to develop assistive technologies to support clinicians and sufferers in the diagnosis and monitoring of treatment progress in a timely and easily accessible format. In the first phase, we aim to develop a diagnostic aid using affective sensing approaches. This paper describes the progress to date and proposes a novel multimodal framework comprising of audio-video fusion for depression diagnosis. We exploit the proposition that the auditory and visual human communication complement each other, which is well-known in auditory-visual speech processing; we investigate this hypothesis for depression analysis. For the video data analysis, intra-facial muscle movements and the movements of the head and shoulders are analysed by computing spatio-temporal interest points. In addition, various audio features (fundamental frequency f0, loudness, intensity and mel-frequency cepstral coefficients) are computed. Next, a bag of visual features and a bag of audio features are generated separately. In this study, we compare fusion methods at feature level, score level and decision level. Experiments are performed on an age and gender matched clinical dataset of 30 patients and 30 healthy controls. The results from the multimodal experiments show the proposed framework’s effectiveness in depression analysis.",
    keywords = "Depression analysis, Multimodal, LBP-TOP",
    author = "Jyoti Joshi and Roland GOECKE and Sharifa Alghowinem and Abhinav Dhall and Michael WAGNER and Julien Epps and Gordon Parker and Michael Breakspear",
    year = "2013",
    doi = "10.1007/s12193-013-0123-2",
    language = "English",
    volume = "7",
    pages = "217--228",
    journal = "Journal on Multimodal User Interfaces",
    issn = "1783-7677",
    publisher = "Springer Verlag",
    number = "3",

    }

    Joshi, J, Goecke, R, Alghowinem, S, Dhall, A, Wagner, M, Epps, J, Parker, G & Breakspear, M 2013, 'Multimodal assistive technologies for depression diagnosis and monitoring', Journal on Multimodal User Interfaces, vol. 7, no. 3, pp. 217-228. https://doi.org/10.1007/s12193-013-0123-2

    Multimodal assistive technologies for depression diagnosis and monitoring. / Joshi, Jyoti; Goecke, Roland; Alghowinem, Sharifa; Dhall, Abhinav; Wagner, Michael; Epps, Julien; Parker, Gordon; Breakspear, Michael.

    In: Journal on Multimodal User Interfaces, Vol. 7, No. 3, 2013, pp. 217-228.

    Research output: Contribution to journal › Article

    TY - JOUR

    T1 - Multimodal assistive technologies for depression diagnosis and monitoring

    AU - Joshi, Jyoti

    AU - Goecke, Roland

    AU - Alghowinem, Sharifa

    AU - Dhall, Abhinav

    AU - Wagner, Michael

    AU - Epps, Julien

    AU - Parker, Gordon

    AU - Breakspear, Michael

    PY - 2013

    Y1 - 2013

    KW - Depression analysis

    KW - Multimodal

    KW - LBP-TOP

    DO - 10.1007/s12193-013-0123-2

    M3 - Article

    VL - 7

    SP - 217

    EP - 228

    JO - Journal on Multimodal User Interfaces

    JF - Journal on Multimodal User Interfaces

    SN - 1783-7677

    IS - 3

    ER -