Statistical pattern recognition is the most successful approach to automatic speech and speaker recognition (ASASR). Of all the statistical pattern recognition techniques, the hidden Markov model (HMM) is the most important. The Gaussian mixture model (GMM) and vector quantisation (VQ) are also effective techniques, especially for speaker recognition and, in conjunction with HMMs, for speech recognition. However, the performance of these techniques degrades rapidly when training data are insufficient and in the presence of noise or distortion. Fuzzy approaches, with their adjustable parameters, can reduce such degradation. Fuzzy set theory is one of the most successful approaches in pattern recognition, where, based on the idea of a fuzzy membership function, fuzzy C-means (FCM) clustering and noise clustering (NC) are the most important techniques. To establish fuzzy approaches to ASASR, the following basic problems are solved. First, a time-dependent fuzzy membership function is defined for the HMM. Second, a general distance is proposed to obtain a relationship between modelling and clustering techniques. Third, fuzzy entropy (FE) clustering is proposed to relate fuzzy models to statistical models. Finally, fuzzy membership functions are proposed as discriminant functions in decision making. The following models are proposed: 1) the FE-HMM, NC-FE-HMM, FE-GMM, NC-FE-GMM, FE-VQ and NC-FE-VQ in the FE approach; 2) the FCM-HMM, NC-FCM-HMM, FCM-GMM and NC-FCM-GMM in the FCM approach; and 3) the hard HMM and GMM as special cases of both the FE and FCM approaches. In addition, a fuzzy approach to speaker verification and a further extension using possibility theory are proposed. The evaluation experiments performed on the TI46, ANDOSL and YOHO corpora show better results for all of the proposed techniques in comparison with the non-fuzzy baseline techniques.
|Date of Award||2000|
|Supervisor||Michael Wagner (Supervisor) & Tu Van Le (Supervisor)|