A comparative study of recognition of speech using improved MFCC algorithms and rasta filters

Lavneet Singh, Girija Chetty

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

1 Citation (Scopus)

Abstract

Automatic Speech Recognition (ASR) has been an active topic of research for the past four decades. The main objective of the ASR task is to convert a speech segment into an interpretable text message without human intervention. Many algorithms and schemes based on different mathematical paradigms have been proposed in an attempt to improve recognition rates. Cepstral coefficients play an important part in speech theory, and in automatic speech recognition in particular, because they compactly represent the relevant information contained in a short time sample of a continuous speech signal. This paper compares two speech parameterization methods: Mel-Frequency Cepstrum Coefficients (MFCC) and improved MFCC using RASTA filters. In this study, we try to improve the MFCC algorithm to achieve higher accuracy and lower error rates in automatic speech recognition. First, we remove signal correlation through normalization; then we filter the cepstral coefficients with a RASTA filter. Finally, we reduce the dimension of the cepstral coefficients based on their variances in each dimension to obtain our features. Using various classifiers, we try to perform speech feature extraction at the lowest error rate, providing a robust method for automatic speech recognition.
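The three steps named in the abstract (normalization, RASTA filtering, variance-based dimension reduction) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the normalization, so per-coefficient mean and variance normalization is assumed; the RASTA coefficients (numerator 0.2, 0.1, 0, -0.1, -0.2 and a pole of 0.94) are commonly cited values rather than ones taken from the paper; and all function names are hypothetical. The input is assumed to be a list of MFCC frames, each a list of coefficients.

```python
from statistics import mean, pvariance


def normalize(frames):
    """Assumed normalization: zero mean, unit variance per coefficient over time."""
    n_coef = len(frames[0])
    out_cols = []
    for k in range(n_coef):
        col = [f[k] for f in frames]
        mu = mean(col)
        sd = pvariance(col) ** 0.5 or 1.0  # guard against constant tracks
        out_cols.append([(x - mu) / sd for x in col])
    return [list(row) for row in zip(*out_cols)]


def rasta_filter(track, pole=0.94):
    """Band-pass filter one coefficient trajectory over time.

    Difference equation (assumed, commonly cited RASTA form):
    y[n] = 0.2x[n] + 0.1x[n-1] - 0.1x[n-3] - 0.2x[n-4] + pole * y[n-1]
    """
    b = [0.2, 0.1, 0.0, -0.1, -0.2]
    y = []
    for n in range(len(track)):
        acc = sum(b[i] * track[n - i] for i in range(len(b)) if n - i >= 0)
        if n > 0:
            acc += pole * y[n - 1]
        y.append(acc)
    return y


def select_by_variance(frames, keep):
    """Keep the `keep` coefficient dimensions with the largest variance."""
    n_coef = len(frames[0])
    variances = [pvariance([f[k] for f in frames]) for k in range(n_coef)]
    top = sorted(range(n_coef), key=lambda k: variances[k], reverse=True)[:keep]
    top.sort()  # preserve the original coefficient order
    return [[f[k] for k in top] for f in frames], top


def extract_features(frames, keep):
    """Normalize, RASTA-filter each coefficient track, then reduce dimension."""
    normed = normalize(frames)
    cols = [rasta_filter([f[k] for f in normed]) for k in range(len(normed[0]))]
    filtered = [list(row) for row in zip(*cols)]
    return select_by_variance(filtered, keep)
```

Calling `extract_features(frames, keep=8)` on, say, 13-dimensional MFCC frames would return the reduced feature frames together with the indices of the retained coefficients. Because the numerator coefficients sum to zero, the filter suppresses constant (slowly varying) offsets in each coefficient trajectory, which is the usual motivation for RASTA processing.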
Original language: English
Title of host publication: Communications in Computer and Information Science
Editors: Sumeet Dua, Aryya Gangopadhyay, Parimala Thulasiraman, Umberto Straccia, Michael Shepherd, Benno Stein
Place of Publication: Berlin Heidelberg
Publisher: Springer
Pages: 304-314
Number of pages: 11
Volume: 285
ISBN (Print): 9783642291654
DOIs: https://doi.org/10.1007/978-3-642-29166-1_27
Publication status: Published - 2012
Event: 6th International Conference on Information Systems, Technology and Management, ICISTM 2012 - Grenoble, France
Duration: 28 Mar 2012 to 30 Mar 2012

Conference

Conference: 6th International Conference on Information Systems, Technology and Management, ICISTM 2012
Abbreviated title: ICISTM 2012
Country: France
City: Grenoble
Period: 28/03/12 to 30/03/12

Fingerprint

Speech recognition
Parameterization
Feature extraction
Classifiers

Cite this

Singh, L., & Chetty, G. (2012). A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. In S. Dua, A. Gangopadhyay, P. Thulasiraman, U. Straccia, M. Shepherd, & B. Stein (Eds.), Communications in Computer and Information Science (Vol. 285, pp. 304-314). Berlin Heidelberg: Springer. https://doi.org/10.1007/978-3-642-29166-1_27
Singh, Lavneet ; Chetty, Girija. / A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. Communications in Computer and Information Science. editor / Sumeet Dua ; Aryya Gangopadhyay ; Parimala Thulasiraman ; Umberto Straccia ; Michael Shepherd ; Benno Stein. Vol. 285 Berlin Heidelberg : Springer, 2012. pp. 304-314
@inproceedings{c685803d111e4c76b264043e2e989d6d,
title = "A comparative study of recognition of speech using improved MFCC algorithms and rasta filters",
abstract = "Automatic Speech Recognition (ASR) has been an active topic of research for the past four decades. The main objective of the ASR task is to convert a speech segment into an interpretable text message without human intervention. Many algorithms and schemes based on different mathematical paradigms have been proposed in an attempt to improve recognition rates. Cepstral coefficients play an important part in speech theory, and in automatic speech recognition in particular, because they compactly represent the relevant information contained in a short time sample of a continuous speech signal. This paper compares two speech parameterization methods: Mel-Frequency Cepstrum Coefficients (MFCC) and improved MFCC using RASTA filters. In this study, we try to improve the MFCC algorithm to achieve higher accuracy and lower error rates in automatic speech recognition. First, we remove signal correlation through normalization; then we filter the cepstral coefficients with a RASTA filter. Finally, we reduce the dimension of the cepstral coefficients based on their variances in each dimension to obtain our features. Using various classifiers, we try to perform speech feature extraction at the lowest error rate, providing a robust method for automatic speech recognition.",
keywords = "Automatic Speech Recognition, Mel frequency Cepstrum Coefficients, ERB Gammatone Filtering, Hidden Markov Model",
author = "Lavneet Singh and Girija Chetty",
year = "2012",
doi = "10.1007/978-3-642-29166-1_27",
language = "English",
isbn = "9783642291654",
volume = "285",
pages = "304--314",
editor = "Sumeet Dua and Aryya Gangopadhyay and Parimala Thulasiraman and Umberto Straccia and Michael Shepherd and Benno Stein",
booktitle = "Communications in Computer and Information Science",
publisher = "Springer",
address = "Berlin Heidelberg",

}

Singh, L & Chetty, G 2012, A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. in S Dua, A Gangopadhyay, P Thulasiraman, U Straccia, M Shepherd & B Stein (eds), Communications in Computer and Information Science. vol. 285, Springer, Berlin Heidelberg, pp. 304-314, 6th International Conference on Information Systems, Technology and Management, ICISTM 2012, Grenoble, France, 28/03/12. https://doi.org/10.1007/978-3-642-29166-1_27

A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. / Singh, Lavneet; Chetty, Girija.

Communications in Computer and Information Science. ed. / Sumeet Dua; Aryya Gangopadhyay; Parimala Thulasiraman; Umberto Straccia; Michael Shepherd; Benno Stein. Vol. 285 Berlin Heidelberg : Springer, 2012. p. 304-314.

TY - GEN

T1 - A comparative study of recognition of speech using improved MFCC algorithms and rasta filters

AU - Singh, Lavneet

AU - Chetty, Girija

PY - 2012

Y1 - 2012

AB - Automatic Speech Recognition (ASR) has been an active topic of research for the past four decades. The main objective of the ASR task is to convert a speech segment into an interpretable text message without human intervention. Many algorithms and schemes based on different mathematical paradigms have been proposed in an attempt to improve recognition rates. Cepstral coefficients play an important part in speech theory, and in automatic speech recognition in particular, because they compactly represent the relevant information contained in a short time sample of a continuous speech signal. This paper compares two speech parameterization methods: Mel-Frequency Cepstrum Coefficients (MFCC) and improved MFCC using RASTA filters. In this study, we try to improve the MFCC algorithm to achieve higher accuracy and lower error rates in automatic speech recognition. First, we remove signal correlation through normalization; then we filter the cepstral coefficients with a RASTA filter. Finally, we reduce the dimension of the cepstral coefficients based on their variances in each dimension to obtain our features. Using various classifiers, we try to perform speech feature extraction at the lowest error rate, providing a robust method for automatic speech recognition.

KW - Automatic Speech Recognition

KW - Mel frequency Cepstrum Coefficients

KW - ERB Gammatone Filtering

KW - Hidden Markov Model

U2 - 10.1007/978-3-642-29166-1_27

DO - 10.1007/978-3-642-29166-1_27

M3 - Conference contribution

SN - 9783642291654

VL - 285

SP - 304

EP - 314

BT - Communications in Computer and Information Science

A2 - Dua, Sumeet

A2 - Gangopadhyay, Aryya

A2 - Thulasiraman, Parimala

A2 - Straccia, Umberto

A2 - Shepherd, Michael

A2 - Stein, Benno

PB - Springer

CY - Berlin Heidelberg

ER -

Singh L, Chetty G. A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. In Dua S, Gangopadhyay A, Thulasiraman P, Straccia U, Shepherd M, Stein B, editors, Communications in Computer and Information Science. Vol. 285. Berlin Heidelberg: Springer. 2012. p. 304-314 https://doi.org/10.1007/978-3-642-29166-1_27