Abstract
In this paper, we describe an approach to animated speaking
face synthesis and its application in modeling impostor/replay
attack scenarios for face-voice based speaker verification
systems. The speaking face reported here learns the spatiotemporal
relationship between speech acoustics and MPEG-4
compliant facial animation points. The influence of articulatory,
perceptual, and prosodic acoustic features, along with auditory
context, on prediction accuracy was examined. The results
indicate that audiovisual identity verification systems are
vulnerable to impostor/replay attacks using synthetic faces. The
level of vulnerability depends on several factors, such as the
type of audiovisual features, the fusion techniques used for the
audio and video features, and their relative robustness. Also, the
success of the synthetic impostor depends on the type of coarticulation
models and acoustic features used for the
audiovisual mapping of speaking face synthesis.
Original language | English |
---|---|
Title of host publication | Proceedings of the 9th International Conference on Spoken Language Processing Interspeech 2006 - ICSLP |
Editors | Carnegie Mellon |
Place of Publication | Germany |
Publisher | International Speech Communication Association |
Pages | 513-516 |
Number of pages | 4 |
ISBN (Print) | 9781604234497 |
Publication status | Published - 2006 |
Event | 9th International Conference on Spoken Language Processing - Pittsburgh, United States. Duration: 17 Sep 2006 → 21 Sep 2006 |
Conference
Conference | 9th International Conference on Spoken Language Processing |
---|---|
Country/Territory | United States |
City | Pittsburgh |
Period | 17/09/06 → 21/09/06 |