A comparative study of recognition of speech using improved MFCC algorithms and rasta filters

Lavneet Singh, Girija Chetty

    Research output: A Conference proceeding or a Chapter in Book (Conference contribution)

    1 Citation (Scopus)

    Abstract

    Automatic Speech Recognition has been an active topic of research for the past four decades. The main objective of the automatic speech recognition task is to convert a speech segment into an interpretable text message without the need for human intervention. Many algorithms and schemes based on different mathematical paradigms have been proposed in an attempt to improve recognition rates. Cepstral coefficients play an important part in speech theory, and in automatic speech recognition in particular, because they compactly represent the relevant information contained in a short-time sample of a continuous speech signal. The goal of this paper is to compare two speech parameterization methods: Mel-Frequency Cepstrum Coefficients (MFCC) and improved MFCC using RASTA filters. In this study, we try to improve the MFCC algorithms to achieve higher accuracy and reduce error rates in Automatic Speech Recognition. First, we remove signal correlation through normalization; then we apply a RASTA filter to the cepstral coefficients. Finally, we reduce the dimension of the cepstral coefficients according to their variances in each dimension and obtain our features. Using various classifiers, we evaluate the resulting feature extraction with the aim of minimizing the error rate and providing a robust method for Automatic Speech Recognition (ASR).
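The three steps named in the abstract (normalization, RASTA filtering of the cepstral coefficients, variance-based dimension reduction) can be sketched roughly as follows. This is not the authors' implementation. It assumes that "normalization" means per-coefficient cepstral mean and variance normalization, that the RASTA filter uses the commonly cited numerator 0.1 * [2, 1, 0, -1, -2] with a single pole at 0.94, and that dimension reduction keeps the `keep` highest-variance coefficients. The function names, the `keep` parameter, and the random input standing in for real MFCC frames are all hypothetical.

```python
import numpy as np


def cmvn(cepstra):
    """Normalize each coefficient to zero mean and unit variance over time.

    cepstra has shape (frames, coefficients). Assumed form of "normalization".
    """
    mean = cepstra.mean(axis=0)
    std = cepstra.std(axis=0)
    return (cepstra - mean) / np.maximum(std, 1e-8)


def rasta_filter(cepstra, pole=0.94):
    """Band-pass filter each coefficient trajectory along the time axis.

    Assumed transfer function: 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1).
    """
    numer = np.array([0.2, 0.1, 0.0, -0.1, -0.2])
    out = np.zeros_like(cepstra, dtype=float)
    for t in range(cepstra.shape[0]):
        acc = np.zeros(cepstra.shape[1])
        # Feed-forward part: weighted sum of the current and four previous frames.
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * cepstra[t - k]
        # Feedback part: leaky integration of the previous output frame.
        if t > 0:
            acc += pole * out[t - 1]
        out[t] = acc
    return out


def select_by_variance(features, keep):
    """Keep the `keep` coefficients with the largest variance over time."""
    variances = features.var(axis=0)
    idx = np.sort(np.argsort(variances)[::-1][:keep])
    return features[:, idx], idx


# Placeholder data standing in for MFCC frames (200 frames, 13 coefficients).
rng = np.random.default_rng(0)
cepstra = rng.normal(size=(200, 13)) + 5.0

normalized = cmvn(cepstra)
filtered = rasta_filter(normalized)
features, kept = select_by_variance(filtered, keep=8)
```

Because the numerator taps sum to zero, the filter suppresses constant offsets in each coefficient trajectory, which is the usual motivation for applying RASTA processing against slowly varying channel effects.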
    Original language: English
    Title of host publication: Communications in Computer and Information Science
    Editors: Sumeet Dua, Aryya Gangopadhyay, Parimala Thulasiraman, Umberto Straccia, Michael Shepherd, Benno Stein
    Place of Publication: Berlin Heidelberg
    Publisher: Springer
    Pages: 304-314
    Number of pages: 11
    Volume: 285
    ISBN (Print): 9783642291654
    DOIs: https://doi.org/10.1007/978-3-642-29166-1_27
    Publication status: Published - 2012
    Event: 6th International Conference on Information Systems, Technology and Management, ICISTM 2012 - Grenoble, France
    Duration: 28 Mar 2012 to 30 Mar 2012

    Conference

    Conference: 6th International Conference on Information Systems, Technology and Management, ICISTM 2012
    Abbreviated title: ICISTM 2012
    Country: France
    City: Grenoble
    Period: 28/03/12 to 30/03/12

    Fingerprint

    Speech recognition
    Parameterization
    Feature extraction
    Classifiers

    Cite this

    Singh, L., & Chetty, G. (2012). A comparative study of recognition of speech using improved MFCC algorithms and rasta filters. In S. Dua, A. Gangopadhyay, P. Thulasiraman, U. Straccia, M. Shepherd, & B. Stein (Eds.), Communications in Computer and Information Science (Vol. 285, pp. 304-314). Berlin Heidelberg: Springer. https://doi.org/10.1007/978-3-642-29166-1_27
    @inproceedings{c685803d111e4c76b264043e2e989d6d,
    title = "A comparative study of recognition of speech using improved MFCC algorithms and rasta filters",
    keywords = "Automatic Speech Recognition, Mel frequency Cepstrum Coefficients, ERB Gammatone Filtering, Hidden Markov Model",
    author = "Lavneet Singh and Girija Chetty",
    year = "2012",
    doi = "10.1007/978-3-642-29166-1_27",
    language = "English",
    isbn = "9783642291654",
    volume = "285",
    pages = "304--314",
    editor = "Sumeet Dua and Aryya Gangopadhyay and Parimala Thulasiraman and Umberto Straccia and Michael Shepherd and Benno Stein",
    booktitle = "Communications in Computer and Information Science",
    publisher = "Springer",
    address = "Berlin Heidelberg",

    }
