Modeling Spectral Variability for the Classification of Depressed Speech

Nicholas Cummins, Julien Epps, Vidhyasaharan Sethu, Michael Breakspear, Roland Goecke

Research output: A Conference proceeding or a Chapter in Book, Conference contribution, peer-review

49 Citations (Scopus)
2 Downloads (Pure)


Quantifying how the spectral content of speech relates to changes in mental state may be crucial in building an objective speech-based depression classification system with clinical utility. This paper investigates the hypothesis that important depression-based information can be captured within the covariance structure of a Gaussian Mixture Model (GMM) of recorded speech. Significant negative correlations found between a speaker’s average weighted variance, a GMM-based indicator of speaker variability, and their level of depression support this hypothesis. Further evidence is provided by the comparison of classification accuracies from seven different GMM-UBM systems, each formed by adapting a different combination of parameters during MAP adaptation. This analysis shows that variance-only adaptation either outperforms or matches the de facto standard mean-only adaptation when classifying both the presence and severity of depression. This result is perhaps the first of its kind seen in GMM-UBM speech classification.
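The contrast between mean-only and variance-only adaptation can be made concrete with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-covariance UBM, the widely used relevance-MAP update equations, a relevance factor of 16, and a variance floor added here for numerical safety. The function name `map_adapt` and the list `ADAPTATION_CONFIGS` are hypothetical. The sketch also assumes that the seven systems correspond to the seven non-empty subsets of {weights, means, variances}, which the abstract does not state explicitly.

```python
from itertools import combinations

import numpy as np

# Assumption: the seven systems are the non-empty subsets of the three
# GMM parameter types (the abstract does not list them explicitly).
PARAMS = ("weights", "means", "variances")
ADAPTATION_CONFIGS = [c for r in (1, 2, 3) for c in combinations(PARAMS, r)]


def map_adapt(X, weights, means, variances, adapt=("means",),
              relevance=16.0, var_floor=1e-6):
    """Relevance-MAP adaptation of a diagonal-covariance GMM-UBM (sketch).

    X: (T, D) feature frames; weights: (M,); means, variances: (M, D).
    adapt: non-empty subset of {"weights", "means", "variances"}.
    Parameters not named in `adapt` are copied from the UBM unchanged.
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]

    # Per-frame, per-component log-likelihoods, shape (T, M).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)          # responsibilities

    # Sufficient statistics: soft counts, first and second moments.
    n = post.sum(axis=0)
    n_safe = np.maximum(n, 1e-10)[:, None]
    Ex = post.T @ X / n_safe
    Ex2 = post.T @ (X ** 2) / n_safe
    alpha = (n / (n + relevance))[:, None]           # data-dependent mixing

    new_w, new_mu, new_var = weights.copy(), means.copy(), variances.copy()
    if "weights" in adapt:
        new_w = alpha[:, 0] * n / T + (1.0 - alpha[:, 0]) * weights
        new_w /= new_w.sum()                         # renormalise to sum to 1
    if "means" in adapt:
        new_mu = alpha * Ex + (1.0 - alpha) * means
    if "variances" in adapt:
        # Uses the adapted mean if means were adapted, else the UBM mean.
        new_var = (alpha * Ex2 + (1.0 - alpha) * (variances + means ** 2)
                   - new_mu ** 2)
        new_var = np.maximum(new_var, var_floor)     # floor is an assumption
    return new_w, new_mu, new_var
```

Calling `map_adapt(X, w, mu, var, adapt=("variances",))` leaves the UBM weights and means untouched and moves only the variances toward the speaker's data, whereas `adapt=("means",)` gives the conventional mean-only system; looping over `ADAPTATION_CONFIGS` yields one adapted model per parameter combination.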
Original language: English
Title of host publication: 14th Annual Conference of the International Speech Communication Association (INTERSPEECH 2013): Speech in Life Sciences and Human Societies
Editors: Frederic Bimbot, Cecile Fougeron, Francois Pellegrino
Place of Publication: Lyon, France
Publisher: International Speech Communication Association
Number of pages: 5
ISBN (Print): 9781629934433
Publication status: Published - 2013
Event: 14th Annual Conference of the International Speech Communication Association Interspeech 2013 - Lyon, Lyon, France
Duration: 25 Aug 2013 to 29 Aug 2013


Conference: 14th Annual Conference of the International Speech Communication Association Interspeech 2013
Abbreviated title: INTERSPEECH 2013


