Robustness of Prosodic Features to Voice Imitation

Mireia Farrus, Michael Wagner, Jan Anguita, Javier Hernando

    Research output: A Conference proceeding or a Chapter in Book > Conference contribution

    5 Citations (Scopus)

    Abstract

    Prosody plays an important role in the human recognition process; therefore, prosodic elements are normally used by impersonators aiming to resemble someone else. Since such voice imitation is one of the potential threats to security systems relying on automatic speaker recognition, and prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to what extent a mimicker is able to get close to the prosodic characteristics of a target speaker. To this end, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. The results show that the identification error rate increases for all the features except F0 range when the impersonators' modified voices are used instead of the impersonators' natural voices. Moreover, it seems easier to copy prosody on the basis of a whole sentence than for a specific word.
    Original language: English
    Title of host publication: Proceedings of Interspeech 2008
    Subtitle of host publication: incorporating SST 2008, 22-26 September 2008, Brisbane, Australia
    Editors: Fletcher, Goecke, Burnham, Wagner
    Place of Publication: Australia
    Publisher: International Speech Communication Association
    Pages: 1-6
    Number of pages: 6
    ISBN (Print): 9781615673780
    Publication status: Published - 2008
    Event: Interspeech 2008 - Brisbane, Australia
    Duration: 22 Sep 2008 - 26 Sep 2008

    Conference

    Conference: Interspeech 2008
    Country: Australia
    City: Brisbane
    Period: 22/09/08 - 26/09/08

    Fingerprint

    Security systems
    Identification (control systems)
    Experiments

    Cite this

    Farrus, M., Wagner, M., Anguita, J., & Hernando, J. (2008). Robustness of Prosodic Features to Voice Imitation. In Fletcher, Goecke, Burnham, & Wagner (Eds.), Proceedings of Interspeech 2008: incorporating SST 2008, 22-26 September 2008, Brisbane, Australia (pp. 1-6). Australia: International Speech Communication Association.
    @inproceedings{fdeb5d7dc8d249a0bc72f95175239524,
    title = "Robustness of Prosodic Features to Voice Imitation",
    abstract = "Prosody plays an important role in the human recognition process; therefore, prosodic elements are normally used by impersonators aiming to resemble someone else. Since such voice imitation is one of the potential threats to security systems relying on automatic speaker recognition, and prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to what extent a mimicker is able to get close to the prosodic characteristics of a target speaker. To this end, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. The results show that the identification error rate increases for all the features except F0 range when the impersonators' modified voices are used instead of the impersonators' natural voices. Moreover, it seems easier to copy prosody on the basis of a whole sentence than for a specific word.",
    author = "Mireia Farrus and Michael Wagner and Jan Anguita and Javier Hernando",
    year = "2008",
    language = "English",
    isbn = "9781615673780",
    pages = "1--6",
    editor = "Fletcher and Goecke and Burnham and Wagner",
    booktitle = "Proceedings of Interspeech 2008",
    publisher = "International Speech Communication Association",
    }


