A Robust Speaking Face Modelling Approach Based on Multilevel Fusion

Girija Chetty, Michael Wagner

    Research output: Conference contribution (conference proceeding / chapter in book)

    Abstract

    In this paper, we propose a robust face modelling approach based on multilevel fusion of 3D face biometric information with audio and visual speech information for biometric identity verification applications. The proposed approach combines the information from three audio-video based modules, namely audio, visual speech, and 3D face, and performs tri-module fusion in an automatic, unsupervised and adaptive manner by adapting to the local performance of each module. This is done by taking the output-score based reliability estimates (confidence measures) of each module into account. The module weightings are determined automatically such that the reliability measure of the combined scores is maximised. To test the robustness of the proposed approach, the audio and visual speech (mouth) modalities are degraded to emulate various levels of train/test mismatch, employing additive white Gaussian noise for the audio signals and JPEG compression for the video signals. The results show improved fusion performance over a range of tested levels of audio and video degradation, compared to the individual module performances. Experiments on the 3D stereovision database AVOZES show that, at severe levels of audio and video mismatch, the audio, mouth, 3D face, and tri-module (audio + mouth + 3D face) fusion EERs were 42.9%, 32%, 15%, and 7.3% respectively for the biometric speaker identity verification application.
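The abstract describes an adaptive score-level fusion scheme in which each module's weight is derived from an output-score based reliability estimate. The paper's exact confidence estimator and weighting rule are not given here, so the following is a minimal sketch of the general idea, assuming confidence-proportional weights normalised to sum to one (all names and values are illustrative):

```python
# Sketch of confidence-weighted score fusion (hypothetical weighting rule;
# the paper's actual reliability estimator may differ).

def fuse_scores(scores, confidences):
    """Combine per-module verification scores using weights proportional
    to each module's confidence measure, normalised to sum to 1."""
    total = sum(confidences)
    if total == 0:
        # Fall back to equal weighting when no module reports confidence.
        weights = [1.0 / len(scores)] * len(scores)
    else:
        weights = [c / total for c in confidences]
    return sum(w * s for w, s in zip(weights, scores))

# Illustrative example: audio, mouth (visual speech), and 3D face scores.
# Under heavy audio degradation, the audio confidence drops, so fusion
# leans on the mouth and 3D face modules.
fused = fuse_scores([0.41, 0.63, 0.78], [0.2, 0.5, 0.9])
```

Because the weights adapt to each module's local reliability, a badly degraded modality (for example, audio under additive white Gaussian noise) contributes less to the combined score, which is consistent with the tri-module fusion EER being far lower than any single module's EER.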
    Original language: English
    Title of host publication: Proceedings Digital Image Computing Techniques and Applications - 9th Biennial Conference of the Australian Pattern Recognition Society
    Editors: M. Bottema, A. Maeder, N. Redding, A. Van Den Hengel
    Place of publication: United States
    Publisher: IEEE, Institute of Electrical and Electronics Engineers
    Pages: 408-415
    Number of pages: 8
    ISBN (Print): 9780769530673
    DOI: 10.1109/DICTA.2007.4426826
    Publication status: Published - 2007
    Event: Digital Image Computing Techniques and Applications, DICTA 2007 - Glenelg, Adelaide, Australia
    Duration: 3 Dec 2007 - 5 Dec 2007

    Conference

    Conference: Digital Image Computing Techniques and Applications, DICTA 2007
    Abbreviated title: DICTA 2007
    Country: Australia
    City: Adelaide
    Period: 3/12/07 - 5/12/07

    Cite this

    Chetty, G., & Wagner, M. (2007). A Robust Speaking Face Modelling Approach Based on Multilevel Fusion. In M. Bottema, A. Maeder, N. Redding, & A. Van Den Hengel (Eds.), Proceedings Digital Image Computing Techniques and Applications - 9th Biennial Conference of the Australian Pattern Recognition Society (pp. 408-415). United States: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/DICTA.2007.4426826