Voice source waveforms for utterance level speaker identification using support vector machines

David VANDYKE, Michael Wagner, Roland Goecke

Research output: A Conference proceeding or a Chapter in Book > Conference contribution

4 Citations (Scopus)

Abstract

The voice source waveform generated by the periodic motion of the vocal folds during voiced speech remains to be fully utilised in automatic speaker recognition systems. We perform closed-set speaker identification experiments on the YOHO speech corpus with the aim of continuing our investigation into the level of speaker discriminatory information present in a data-driven parameterisation of the voice source waveform obtained by closed-phase inverse filtering. Discriminatory modelling using support vector machines resulted in utterance-level correct identification rates of 85.3% when using a multi-class model, and 72.5% when using a binary, one-against-all regression model, each on cohorts of 20 speakers. These results compare well with other speaker identification experiments in the literature employing features derived from the voice source waveform, and are positive when observed under the hypothesis that they should be complementary to the common magnitude spectral parameters (mel-cepstra).
Original language: English
Title of host publication: 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013
Editors: Jane Labadin, Jacey-Lynn Minoi, Dayang NurFatimah Awang Iskandar, Azman Bujang
Place of Publication: Malaysia
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1-7
Number of pages: 7
ISBN (Print): 9781479910915
DOI: https://doi.org/10.1109/CITA.2013.6637568
Publication status: Published - 2013
Event: 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Kuching, Malaysia
Duration: 1 Jul 2013 to 4 Jul 2013

Conference

Conference: 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle
Country: Malaysia
City: Kuching
Period: 1/07/13 to 4/07/13

Fingerprint

Support vector machines
Parameterization
Experiments

Cite this

VANDYKE, D., Wagner, M., & Goecke, R. (2013). Voice source waveforms for utterance level speaker identification using support vector machines. In J. Labadin, J-L. Minoi, D. N. A. Iskandar, & A. Bujang (Eds.), 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013 (pp. 1-7). Malaysia: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/CITA.2013.6637568
VANDYKE, David ; Wagner, Michael ; Goecke, Roland. / Voice source waveforms for utterance level speaker identification using support vector machines. 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013. editor / Jane Labadin ; Jacey-Lynn Minoi ; Dayang NurFatimah Awang Iskandar ; Azman Bujang. Malaysia : IEEE, Institute of Electrical and Electronics Engineers, 2013. pp. 1-7
@inproceedings{8acbb154ebbf43b79e582558196b090a,
title = "Voice source waveforms for utterance level speaker identification using support vector machines",
abstract = "The voice source waveform generated by the periodic motion of the vocal folds during voiced speech remains to be fully utilised in automatic speaker recognition systems. We perform closed-set speaker identification experiments on the YOHO speech corpus with the aim of continuing our investigation into the level of speaker discriminatory information present in a data-driven parameterisation of the voice source waveform obtained by closed-phase inverse filtering. Discriminatory modelling using support vector machines resulted in utterance-level correct identification rates of 85.3{\%} when using a multi-class model, and 72.5{\%} when using a binary, one-against-all regression model, each on cohorts of 20 speakers. These results compare well with other speaker identification experiments in the literature employing features derived from the voice source waveform, and are positive when observed under the hypothesis that they should be complementary to the common magnitude spectral parameters (mel-cepstra).",
keywords = "Glottal Waveform, Speaker Identification, Voice Source",
author = "David VANDYKE and Michael Wagner and Roland Goecke",
year = "2013",
doi = "10.1109/CITA.2013.6637568",
language = "English",
isbn = "9781479910915",
pages = "1--7",
editor = "Jane Labadin and Jacey-Lynn Minoi and Iskandar, {Dayang NurFatimah Awang} and Azman Bujang",
booktitle = "2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States",

}

VANDYKE, D, Wagner, M & Goecke, R 2013, Voice source waveforms for utterance level speaker identification using support vector machines. in J Labadin, J-L Minoi, DNA Iskandar & A Bujang (eds), 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013. IEEE, Institute of Electrical and Electronics Engineers, Malaysia, pp. 1-7, 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Kuching, Malaysia, 1/07/13. https://doi.org/10.1109/CITA.2013.6637568

TY - GEN

T1 - Voice source waveforms for utterance level speaker identification using support vector machines

AU - VANDYKE, David

AU - Wagner, Michael

AU - Goecke, Roland

PY - 2013

Y1 - 2013

AB - The voice source waveform generated by the periodic motion of the vocal folds during voiced speech remains to be fully utilised in automatic speaker recognition systems. We perform closed-set speaker identification experiments on the YOHO speech corpus with the aim of continuing our investigation into the level of speaker discriminatory information present in a data-driven parameterisation of the voice source waveform obtained by closed-phase inverse filtering. Discriminatory modelling using support vector machines resulted in utterance-level correct identification rates of 85.3% when using a multi-class model, and 72.5% when using a binary, one-against-all regression model, each on cohorts of 20 speakers. These results compare well with other speaker identification experiments in the literature employing features derived from the voice source waveform, and are positive when observed under the hypothesis that they should be complementary to the common magnitude spectral parameters (mel-cepstra).

KW - Glottal Waveform

KW - Speaker Identification

KW - Voice Source

U2 - 10.1109/CITA.2013.6637568

DO - 10.1109/CITA.2013.6637568

M3 - Conference contribution

SN - 9781479910915

SP - 1

EP - 7

BT - 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013

A2 - Labadin, Jane

A2 - Minoi, Jacey-Lynn

A2 - Iskandar, Dayang NurFatimah Awang

A2 - Bujang, Azman

PB - IEEE, Institute of Electrical and Electronics Engineers

CY - Malaysia

ER -

VANDYKE D, Wagner M, Goecke R. Voice source waveforms for utterance level speaker identification using support vector machines. In Labadin J, Minoi J-L, Iskandar DNA, Bujang A, editors, 2013 8th International Conference on Information Technology in Asia - Smart Devices Trend: Technologising Future Lifestyle, Proceedings of CITA 2013. Malaysia: IEEE, Institute of Electrical and Electronics Engineers. 2013. p. 1-7 https://doi.org/10.1109/CITA.2013.6637568