TY - JOUR
T1 - Tabby Talks
T2 - An automated tool for the assessment of childhood apraxia of speech
AU - Shahin, Mostafa
AU - Ahmed, Beena
AU - Parnandi, Avinash
AU - Karappa, Virendra
AU - McKechnie, Jacqueline
AU - Ballard, Kirrie J.
AU - Gutierrez-Osuna, Ricardo
PY - 2015/6/1
Y1 - 2015/6/1
AB - Children with developmental disabilities such as childhood apraxia of speech (CAS) require repeated intervention sessions with a speech therapist, sometimes extending over several years. Technology-based therapy tools offer the potential to reduce the demanding workload of speech therapists as well as time and cost for families. In response to this need, we have developed "Tabby Talks," a multi-tier system for remote administration of speech therapy. This paper describes the speech processing pipeline to automatically detect common errors associated with CAS. The pipeline contains modules for voice activity detection, pronunciation verification, and lexical stress verification. The voice activity detector evaluates the intensity contour of an utterance and compares it against an adaptive threshold to detect silence segments and measure voicing delays and total production time. The pronunciation verification module uses a generic search lattice structure with multiple internal paths that covers all possible pronunciation errors (substitutions, insertions, and deletions) in the child's production. Finally, the lexical stress verification module classifies the lexical stress across consecutive syllables into strong-weak or weak-strong patterns using a combination of prosodic and spectral measures. These error measures can be provided to the therapist through a web interface, to enable them to adapt the child's therapy program remotely. When evaluated on a dataset of typically developing and disordered speech from children aged 4-16 years, the system achieves a pronunciation verification accuracy of 88.2% at the phoneme level and 80.7% at the utterance level, and a lexical stress classification rate of 83.3%.
KW - Automatic speech recognition
KW - Computer aided pronunciation learning
KW - Pronunciation verification
KW - Prosody
KW - Speech therapy
UR - http://www.scopus.com/inward/record.url?scp=84928481301&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2015.04.002
DO - 10.1016/j.specom.2015.04.002
M3 - Article
AN - SCOPUS:84928481301
SN - 0167-6393
VL - 70
SP - 49
EP - 64
JO - Speech Communication
JF - Speech Communication
ER -