Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech

Mostafa Shahin, Beena Ahmed, Jacqueline McKechnie, Kirrie Ballard, Ricardo Gutierrez-Osuna

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

9 Citations (Scopus)

Abstract

This paper introduces a pronunciation verification method to be used in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85% when used with disordered speech.
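
The idea of a phone-based search lattice can be illustrated with a minimal sketch. This is not the authors' implementation: the `ALTERNATIVES` table, the function names, and the example word are all hypothetical, and the sketch covers substitutions only. In a real system an acoustic model (GMM-HMM or DNN-HMM) would score the lattice paths against the audio and select the best one; here the decoded path is simply passed in.

```python
from itertools import product

# Hypothetical substitution table: for each expected phone, the extra phones
# the lattice allows at that position. Illustrative values, not from the paper.
ALTERNATIVES = {
    "r": ["w"],
    "s": ["th"],
    "k": ["t"],
}


def build_lattice(expected_phones):
    """Return one list of candidate phones per position, expected phone first."""
    return [[p] + ALTERNATIVES.get(p, []) for p in expected_phones]


def enumerate_paths(lattice):
    """List every phone sequence the lattice accepts."""
    return [list(path) for path in product(*lattice)]


def verify(expected_phones, decoded_phones):
    """Compare a decoded path with the expected pronunciation.

    Assumes both sequences have the same length (substitutions only).
    Returns (is_correct, errors), where errors holds
    (position, expected_phone, decoded_phone) tuples.
    """
    errors = [
        (i, exp, dec)
        for i, (exp, dec) in enumerate(zip(expected_phones, decoded_phones))
        if exp != dec
    ]
    return len(errors) == 0, errors


expected = ["r", "ae", "b", "ih", "t"]      # hypothetical target word
lattice = build_lattice(expected)
paths = enumerate_paths(lattice)
result = verify(expected, ["w", "ae", "b", "ih", "t"])
```

With the table above, the lattice accepts two pronunciations of the example word, and `verify` flags position 0 as an `r` produced as `w`. Handling inserted or deleted phones would need extra lattice arcs and an alignment step, which this sketch leaves out.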

Original language: English
Title of host publication: 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014)
Subtitle of host publication: Celebrating the Diversity of Spoken Languages
Editors: H. Li, P. Ching
Place of Publication: Baixas, France
Publisher: International Speech Communication Association
Pages: 1583-1587
Number of pages: 5
Volume: 1
ISBN (Print): 9781634394352
Publication status: Published - 2014
Externally published: Yes
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 Sep 2014 to 18 Sep 2014

Conference

Conference: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014
Country: Singapore
City: Singapore
Period: 14/09/14 to 18/09/14

Cite this

Shahin, M., Ahmed, B., McKechnie, J., Ballard, K., & Gutierrez-Osuna, R. (2014). Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. In H. Li, & P. Ching (Eds.), 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014): Celebrating the Diversity of Spoken Languages (Vol. 1, pp. 1583-1587). Baixas, France: International Speech Communication Association.
Shahin, Mostafa ; Ahmed, Beena ; McKechnie, Jacqueline ; Ballard, Kirrie ; Gutierrez-Osuna, Ricardo. / Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014): Celebrating the Diversity of Spoken Languages. editor / H. Li ; P. Ching. Vol. 1 Baixas, France : International Speech Communication Association, 2014. pp. 1583-1587
@inproceedings{206ff96f99ce42cfadd61ea0765f9012,
title = "Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech",
abstract = "This paper introduces a pronunciation verification method to be used in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85{\%} when used with disordered speech.",
keywords = "Automatic speech recognition, Computer aided pronunciation learning, Deep learning, Pronunciation verification, Speech therapy",
author = "Mostafa Shahin and Beena Ahmed and Jacqueline McKechnie and Kirrie Ballard and Ricardo Gutierrez-Osuna",
year = "2014",
language = "English",
isbn = "9781634394352",
volume = "1",
pages = "1583--1587",
editor = "H. Li and P. Ching",
booktitle = "15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014)",
publisher = "International Speech Communication Association",

}

Shahin, M, Ahmed, B, McKechnie, J, Ballard, K & Gutierrez-Osuna, R 2014, Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. in H Li & P Ching (eds), 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014): Celebrating the Diversity of Spoken Languages. vol. 1, International Speech Communication Association, Baixas, France, pp. 1583-1587, 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014, Singapore, Singapore, 14/09/14.

Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. / Shahin, Mostafa; Ahmed, Beena; McKechnie, Jacqueline; Ballard, Kirrie; Gutierrez-Osuna, Ricardo.

15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014): Celebrating the Diversity of Spoken Languages. ed. / H. Li; P. Ching. Vol. 1 Baixas, France : International Speech Communication Association, 2014. p. 1583-1587.

TY  - GEN
T1  - Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech
AU  - Shahin, Mostafa
AU  - Ahmed, Beena
AU  - McKechnie, Jacqueline
AU  - Ballard, Kirrie
AU  - Gutierrez-Osuna, Ricardo
PY  - 2014
Y1  - 2014
AB  - This paper introduces a pronunciation verification method to be used in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85% when used with disordered speech.
KW  - Automatic speech recognition
KW  - Computer aided pronunciation learning
KW  - Deep learning
KW  - Pronunciation verification
KW  - Speech therapy
UR  - http://www.scopus.com/inward/record.url?scp=84910091933&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 9781634394352
VL  - 1
SP  - 1583
EP  - 1587
BT  - 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014)
A2  - Li, H.
A2  - Ching, P.
PB  - International Speech Communication Association
CY  - Baixas, France
ER  -

Shahin M, Ahmed B, McKechnie J, Ballard K, Gutierrez-Osuna R. Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech. In Li H, Ching P, editors, 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014): Celebrating the Diversity of Spoken Languages. Vol. 1. Baixas, France: International Speech Communication Association. 2014. p. 1583-1587