Modeling Spectral Variability for the Classification of Depressed Speech

Nicholas Cummins, Julien Epps, Vidhyasaharan Sethu, Michael Breakspear, Roland GOECKE

Research output: A Conference proceeding or a Chapter in Book > Conference contribution


Abstract

Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker’s average weighted variance - a GMM-based indicator of speaker variability - and their level of depression support this hypothesis. Further evidence is provided by the comparison of classification accuracies from seven different GMM-UBM systems, each formed by varying different parameter combinations during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.
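
The MAP adaptation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard relevance-MAP update for a diagonal-covariance GMM-UBM, and the function names, the relevance factor of 16, and the variance floor are illustrative choices. Reading the "seven systems" as the seven non-empty subsets of {weights, means, variances} is also an assumption; the abstract does not list them.

```python
import numpy as np
from itertools import combinations

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian: (T, D) -> (T, K)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
    return -0.5 * (log_norm + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def map_adapt(X, weights, means, variances, adapt=("means",),
              relevance=16.0, var_floor=1e-6):
    """Relevance-MAP adaptation of a UBM to frames X (hypothetical helper).

    Only the parameter groups named in `adapt` are updated; the rest are
    copied from the UBM. relevance and var_floor are assumed values.
    """
    T = X.shape[0]
    # Component posteriors per frame, computed stably in the log domain.
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)

    # Sufficient statistics: soft counts, first and second moments.
    n = post.sum(axis=0)
    n_safe = np.maximum(n, 1e-10)
    Ex = (post.T @ X) / n_safe[:, None]
    Ex2 = (post.T @ X ** 2) / n_safe[:, None]
    alpha = (n / (n + relevance))[:, None]  # data-dependent mixing coefficient

    new_w, new_mu, new_var = weights.copy(), means.copy(), variances.copy()
    if "weights" in adapt:
        new_w = alpha[:, 0] * n / T + (1 - alpha[:, 0]) * weights
        new_w /= new_w.sum()
    if "means" in adapt:
        new_mu = alpha * Ex + (1 - alpha) * means
    if "variances" in adapt:
        # Uses the resulting mean, so variance-only adaptation keeps the UBM mean.
        new_var = alpha * Ex2 + (1 - alpha) * (variances + means ** 2) - new_mu ** 2
        new_var = np.maximum(new_var, var_floor)  # floor is an added safeguard
    return new_w, new_mu, new_var

# Assumed enumeration of the parameter combinations (non-empty subsets).
PARAMS = ("weights", "means", "variances")
ALL_COMBOS = [c for r in (1, 2, 3) for c in combinations(PARAMS, r)]
```

Calling map_adapt(X, w, mu, var, adapt=("variances",)) would give the variance-only case, and adapt=("means",) the mean-only baseline that the abstract compares against.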
Original language: English
Title of host publication: 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies
Editors: Frederic Bimbot, Cecile Fougeron, Francois Pellegrino
Place of Publication: Lyon, France
Publisher: International Speech Communication Association
Pages: 857-861
Number of pages: 5
Volume: 2
ISBN (Print): 9781629934433
Publication status: Published - 2013
Event: 14th Annual Conference of the International Speech Communication Association Interspeech 2013 - Lyon, France
Duration: 25 Aug 2013 to 29 Aug 2013

Conference

Conference: 14th Annual Conference of the International Speech Communication Association Interspeech 2013
Abbreviated title: INTERSPEECH 2013
Country: France
City: Lyon
Period: 25/08/13 to 29/08/13


Cite this

Cummins, N., Epps, J., Sethu, V., Breakspear, M., & GOECKE, R. (2013). Modeling Spectral Variability for the Classification of Depressed Speech. In F. Bimbot, C. Fougeron, & F. Pellegrino (Eds.), 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies (Vol. 2, pp. 857-861). Lyon, France: International Speech Communication Association.
Cummins, Nicholas ; Epps, Julien ; Sethu, Vidhyasaharan ; Breakspear, Michael ; GOECKE, Roland. / Modeling Spectral Variability for the Classification of Depressed Speech. 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies. editor / Frederic Bimbot ; Cecile Fougeron ; Francois Pellegrino. Vol. 2 Lyon, France : International Speech Communication Association, 2013. pp. 857-861
@inproceedings{b501e5918e5a42669fab7e4253c51de5,
title = "Modeling Spectral Variability for the Classification of Depressed Speech",
abstract = "Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker’s average weighted variance - a GMM-based indicator of speaker variability - and their level of depression support this hypothesis. Further evidence is provided by the comparison of classification accuracies from seven different GMM-UBM systems, each formed by varying different parameter combinations during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.",
keywords = "Depression analysis, Spectral Variability, MFCC",
author = "Nicholas Cummins and Julien Epps and Vidhyasaharan Sethu and Michael Breakspear and Roland GOECKE",
year = "2013",
language = "English",
isbn = "9781629934433",
volume = "2",
pages = "857--861",
editor = "Frederic Bimbot and Cecile Fougeron and Francois Pellegrino",
booktitle = "14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies",
publisher = "International Speech Communication Association",
}

Cummins, N, Epps, J, Sethu, V, Breakspear, M & GOECKE, R 2013, Modeling Spectral Variability for the Classification of Depressed Speech. in F Bimbot, C Fougeron & F Pellegrino (eds), 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies. vol. 2, International Speech Communication Association, Lyon, France, pp. 857-861, 14th Annual Conference of the International Speech Communication Association Interspeech 2013, Lyon, France, 25/08/13.

Modeling Spectral Variability for the Classification of Depressed Speech. / Cummins, Nicholas; Epps, Julien; Sethu, Vidhyasaharan; Breakspear, Michael; GOECKE, Roland.

14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies. ed. / Frederic Bimbot; Cecile Fougeron; Francois Pellegrino. Vol. 2 Lyon, France : International Speech Communication Association, 2013. p. 857-861.

TY - GEN
T1 - Modeling Spectral Variability for the Classification of Depressed Speech
AU - Cummins, Nicholas
AU - Epps, Julien
AU - Sethu, Vidhyasaharan
AU - Breakspear, Michael
AU - GOECKE, Roland
PY - 2013
Y1 - 2013
N2 - Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker’s average weighted variance - a GMM-based indicator of speaker variability - and their level of depression support this hypothesis. Further evidence is provided by the comparison of classification accuracies from seven different GMM-UBM systems, each formed by varying different parameter combinations during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.
AB - Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker’s average weighted variance - a GMM-based indicator of speaker variability - and their level of depression support this hypothesis. Further evidence is provided by the comparison of classification accuracies from seven different GMM-UBM systems, each formed by varying different parameter combinations during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.
KW - Depression analysis
KW - Spectral Variability
KW - MFCC
M3 - Conference contribution
SN - 9781629934433
VL - 2
SP - 857
EP - 861
BT - 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies
A2 - Bimbot, Frederic
A2 - Fougeron, Cecile
A2 - Pellegrino, Francois
PB - International Speech Communication Association
CY - Lyon, France
ER -

Cummins N, Epps J, Sethu V, Breakspear M, GOECKE R. Modeling Spectral Variability for the Classification of Depressed Speech. In Bimbot F, Fougeron C, Pellegrino F, editors, 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies. Vol. 2. Lyon, France: International Speech Communication Association. 2013. p. 857-861