How Vulnerable are Prosodic Features to Professional Imitators?

Mireia Farrus, Michael Wagner, Jan Anguita, Javier Hernando

Research output: A Conference proceeding or a Chapter in Book > Conference contribution

16 Citations (Scopus)

Abstract

Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. By analysing prosodic parameters, the results show that the identification error rate increases for most of the features, except for the range of the fundamental frequency, which seems to be relatively robust against voice mimicking. When all twelve features are fused, the identification error rate increases from 5% between the target voices and the imitators’ natural voices to 22% between the target voices and the imitators’ impersonations.
Original language: English
Title of host publication: Proceedings of Odyssey 2008
Subtitle of host publication: The Speaker and Language Recognition Workshop
Editors: Nico Brummer, Johann de Preez
Place of Publication: South Africa
Publisher: International Speech Communication Association
Pages: 1-6
Number of pages: 6
Publication status: Published - 2008
Event: Odyssey 2008, The Speaker and Language Recognition Workshop - Stellenbosch, Stellenbosch, South Africa
Duration: 21 Jan 2008 - 24 Jan 2008

Publication series

Name: Odyssey: The Speaker and Language Recognition Workshop
Publisher: International Speech Communication Association

Conference

Conference: Odyssey 2008, The Speaker and Language Recognition Workshop
Country: South Africa
City: Stellenbosch
Period: 21/01/08 - 24/01/08

Fingerprint

Security systems
Identification (control systems)
Experiments

Cite this

Farrus, M., Wagner, M., Anguita, J., & Hernando, J. (2008). How Vulnerable are Prosodic Features to Professional Imitators? In N. Brummer, & J. D. Preez (Eds.), Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop (pp. 1-6). (Odyssey: The Speaker and Language Recognition Workshop). South Africa: International Speech Communication Association.
Farrus, Mireia ; Wagner, Michael ; Anguita, Jan ; Hernando, Javier. / How Vulnerable are Prosodic Features to Professional Imitators? Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop. editor / Nico Brummer ; Johann de Preez. South Africa : International Speech Communication Association, 2008. pp. 1-6 (Odyssey: The Speaker and Language Recognition Workshop).
@inproceedings{7ff1dc5ee204412d9068b044fa3fca40,
title = "How Vulnerable are Prosodic Features to Professional Imitators?",
abstract = "Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. By analysing prosodic parameters, the results show that the identification error rate increases for most of the features, except for the range of the fundamental frequency, which seems to be relatively robust against voice mimicking. When all twelve features are fused, the identification error rate increases from 5{\%} between the target voices and the imitators’ natural voices to 22{\%} between the target voices and the imitators’ impersonations.",
author = "Mireia Farrus and Michael Wagner and Jan Anguita and Javier Hernando",
year = "2008",
language = "English",
series = "Odyssey: The Speaker and Language Recognition Workshop",
publisher = "International Speech Communication Association",
pages = "1--6",
editor = "Nico Brummer and Preez, {Johann de}",
booktitle = "Proceedings of Odyssey 2008",
}

Farrus, M, Wagner, M, Anguita, J & Hernando, J 2008, How Vulnerable are Prosodic Features to Professional Imitators? in N Brummer & JD Preez (eds), Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop. Odyssey: The Speaker and Language Recognition Workshop, International Speech Communication Association, South Africa, pp. 1-6, Odyssey 2008, The Speaker and Language Recognition Workshop, Stellenbosch, South Africa, 21/01/08.

How Vulnerable are Prosodic Features to Professional Imitators? / Farrus, Mireia; Wagner, Michael; Anguita, Jan; Hernando, Javier.

Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop. ed. / Nico Brummer; Johann de Preez. South Africa : International Speech Communication Association, 2008. p. 1-6 (Odyssey: The Speaker and Language Recognition Workshop).

TY - GEN
T1 - How Vulnerable are Prosodic Features to Professional Imitators?
AU - Farrus, Mireia
AU - Wagner, Michael
AU - Anguita, Jan
AU - Hernando, Javier
PY - 2008
Y1 - 2008
AB - Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. By analysing prosodic parameters, the results show that the identification error rate increases for most of the features, except for the range of the fundamental frequency, which seems to be relatively robust against voice mimicking. When all twelve features are fused, the identification error rate increases from 5% between the target voices and the imitators’ natural voices to 22% between the target voices and the imitators’ impersonations.
M3 - Conference contribution
T3 - Odyssey: The Speaker and Language Recognition Workshop
SP - 1
EP - 6
BT - Proceedings of Odyssey 2008
A2 - Brummer, Nico
A2 - Preez, Johann de
PB - International Speech Communication Association
CY - South Africa
ER -

Farrus M, Wagner M, Anguita J, Hernando J. How Vulnerable are Prosodic Features to Professional Imitators? In Brummer N, Preez JD, editors, Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop. South Africa: International Speech Communication Association. 2008. p. 1-6. (Odyssey: The Speaker and Language Recognition Workshop).