Automatic Depression Classification Based on Affective Read Sentences: Opportunities for Text-Dependent Analysis

Brian Stasak, Julien Epps, Roland Goecke

Research output: Contribution to journal › Article

Abstract

In the future, automatic speech-based analysis of mental health could become widely available to help augment conventional healthcare evaluation methods. For speech-based patient evaluations of this kind, protocol design is a key consideration. Read speech provides an advantage over other verbal modes (e.g. automatic, spontaneous) by providing a clinically stable and repeatable protocol. Further, text-dependent speech helps to reduce phonetic variability and delivers controllable linguistic/affective stimuli, therefore allowing more precise analysis of recorded stimuli deviations. The purpose of this study is to investigate speech disfluency behaviors in nondepressed/depressed speakers using read aloud text containing constrained affective-linguistic criteria. Herein, using the Black Dog Institute Affective Sentences (BDAS) corpus, analysis demonstrates statistically significant feature differences in speech disfluencies, whereby when compared to non-depressed speakers, depressed speakers show relatively higher recorded frequencies of hesitations (55% increase) and speech errors (71% increase). Our study examines both manually and automatically labeled speech disfluency features, demonstrating that detailed disfluency analysis leads to considerable gains, of up to 100% in absolute depression classification accuracy, especially with affective considerations, when compared with the affect-agnostic acoustic baseline (65%).
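The "55% increase" and "71% increase" figures above are relative increases in disfluency frequency for depressed speakers over non-depressed speakers. As a minimal sketch of that arithmetic only, the Python snippet below computes a relative increase from two rates. The function name `relative_increase` and the example counts are hypothetical, chosen purely for illustration; they are not taken from the BDAS corpus or from the paper's analysis.

```python
def relative_increase(baseline_rate: float, comparison_rate: float) -> float:
    """Return the percentage increase of comparison_rate over baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (comparison_rate - baseline_rate) / baseline_rate * 100.0


# Hypothetical mean hesitation counts per speaker (illustrative values only,
# NOT data from the study).
nondepressed_hesitations = 2.0
depressed_hesitations = 3.1

increase = relative_increase(nondepressed_hesitations, depressed_hesitations)
print(f"Relative increase in hesitations: {increase:.0f}%")
```

A relative increase of this kind is a ratio against the non-depressed group's rate, so it should not be read as a difference in percentage points.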
Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: Speech Communication
Volume: 115
DOI: 10.1016/j.specom.2019.10.003
Publication status: Published - Dec 2019

Cite this

@article{2237788c094f461db7b307b0d3dc7bbd,
title = "Automatic Depression Classification Based on Affective Read Sentences: Opportunities for Text-Dependent Analysis",
abstract = "In the future, automatic speech-based analysis of mental health could become widely available to help augment conventional healthcare evaluation methods. For speech-based patient evaluations of this kind, protocol design is a key consideration. Read speech provides an advantage over other verbal modes (e.g. automatic, spontaneous) by providing a clinically stable and repeatable protocol. Further, text-dependent speech helps to reduce phonetic variability and delivers controllable linguistic/affective stimuli, therefore allowing more precise analysis of recorded stimuli deviations. The purpose of this study is to investigate speech disfluency behaviors in nondepressed/depressed speakers using read aloud text containing constrained affective-linguistic criteria. Herein, using the Black Dog Institute Affective Sentences (BDAS) corpus, analysis demonstrates statistically significant feature differences in speech disfluencies, whereby when compared to non-depressed speakers, depressed speakers show relatively higher recorded frequencies of hesitations (55{\%} increase) and speech errors (71{\%} increase). Our study examines both manually and automatically labeled speech disfluency features, demonstrating that detailed disfluency analysis leads to considerable gains, of up to 100{\%} in absolute depression classification accuracy, especially with affective considerations, when compared with the affect-agnostic acoustic baseline (65{\%}).",
keywords = "Digital phenotyping, Digital medicine, Paralinguistics, Machine learning, Speech elicitation, Valence",
author = "Brian Stasak and Julien Epps and Roland Goecke",
year = "2019",
month = "12",
doi = "10.1016/j.specom.2019.10.003",
language = "English",
volume = "115",
pages = "1--14",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",
}

Automatic Depression Classification Based on Affective Read Sentences: Opportunities for Text-Dependent Analysis. / Stasak, Brian; Epps, Julien; Goecke, Roland.

In: Speech Communication, Vol. 115, 12.2019, p. 1-14.

TY  - JOUR
T1  - Automatic Depression Classification Based on Affective Read Sentences: Opportunities for Text-Dependent Analysis
AU  - Stasak, Brian
AU  - Epps, Julien
AU  - Goecke, Roland
PY  - 2019/12
Y1  - 2019/12
N2  - In the future, automatic speech-based analysis of mental health could become widely available to help augment conventional healthcare evaluation methods. For speech-based patient evaluations of this kind, protocol design is a key consideration. Read speech provides an advantage over other verbal modes (e.g. automatic, spontaneous) by providing a clinically stable and repeatable protocol. Further, text-dependent speech helps to reduce phonetic variability and delivers controllable linguistic/affective stimuli, therefore allowing more precise analysis of recorded stimuli deviations. The purpose of this study is to investigate speech disfluency behaviors in nondepressed/depressed speakers using read aloud text containing constrained affective-linguistic criteria. Herein, using the Black Dog Institute Affective Sentences (BDAS) corpus, analysis demonstrates statistically significant feature differences in speech disfluencies, whereby when compared to non-depressed speakers, depressed speakers show relatively higher recorded frequencies of hesitations (55% increase) and speech errors (71% increase). Our study examines both manually and automatically labeled speech disfluency features, demonstrating that detailed disfluency analysis leads to considerable gains, of up to 100% in absolute depression classification accuracy, especially with affective considerations, when compared with the affect-agnostic acoustic baseline (65%).
KW  - Digital phenotyping
KW  - Digital medicine
KW  - Paralinguistics
KW  - Machine learning
KW  - Speech elicitation
KW  - Valence
UR  - http://www.scopus.com/inward/record.url?scp=85073675194&partnerID=8YFLogxK
UR  - http://www.mendeley.com/research/automatic-depression-classification-based-affective-read-sentences-opportunities-textdependent-analy-1
U2  - 10.1016/j.specom.2019.10.003
DO  - 10.1016/j.specom.2019.10.003
M3  - Article
VL  - 115
SP  - 1
EP  - 14
JO  - Speech Communication
JF  - Speech Communication
SN  - 0167-6393
ER  -