Audiovisual Speaker Identity Verification Based on Cross Modal Fusion

Girija Chetty, Michael Wagner

    Research output: Contribution to conference (non-published works) › Paper

    Abstract

    In this paper, we propose the fusion of audio and explicit cross-modal correlation features for speaker identity verification applications. Experiments with GMM-based speaker models and a hybrid fusion technique, involving late fusion of explicit cross-modal fusion features with eigen-lip and audio MFCC features, show a considerable improvement in EER performance. An evaluation of the system with gender-specific datasets from the controlled VidTIMIT database and the opportunistic UCBN database shows that it is possible to achieve an EER of less than 2% with correlated-component hybrid fusion, an improvement of around 22% over uncorrelated-component fusion.
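
    As a rough illustration of the score-level (late) fusion pipeline described in the abstract, the sketch below trains per-modality GMMs and fuses their log-likelihood-ratio scores before computing an EER. It is not the authors' implementation: the real MFCC and eigen-lip feature extraction is replaced by synthetic data, and the fusion weight, model sizes, and all names are placeholder assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)

def make_features(dim, shift, n=400):
    # Stand-in for real MFCC / eigen-lip feature vectors of one stream.
    return rng.normal(loc=shift, scale=1.0, size=(n, dim))

def llr_score(client_gmm, background_gmm, feats):
    # Average log-likelihood ratio of a test segment: client vs. background model.
    return client_gmm.score(feats) - background_gmm.score(feats)

def equal_error_rate(genuine, impostor):
    # EER: operating point where false-accept and false-reject rates are equal.
    scores = np.concatenate([genuine, impostor])
    labels = np.concatenate([np.ones(len(genuine)), np.zeros(len(impostor))])
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[idx] + fnr[idx]) / 2.0

# Hypothetical feature dimensions for the two streams.
dims = {"audio_mfcc": 13, "eigen_lip": 10}

# Train one client GMM and one crude background GMM per modality (synthetic data).
client_gmms, background_gmms = {}, {}
for modality, dim in dims.items():
    client_gmms[modality] = GaussianMixture(
        n_components=8, covariance_type="diag", random_state=0
    ).fit(make_features(dim, shift=1.0))
    background_gmms[modality] = GaussianMixture(
        n_components=8, covariance_type="diag", random_state=0
    ).fit(make_features(dim, shift=0.0, n=2000))

def fused_score(trial, audio_weight=0.6):
    # Late (score-level) fusion: weighted sum of per-modality LLR scores.
    scores = {m: llr_score(client_gmms[m], background_gmms[m], trial[m]) for m in dims}
    return audio_weight * scores["audio_mfcc"] + (1.0 - audio_weight) * scores["eigen_lip"]

# Score synthetic genuine and impostor trials and report the EER.
genuine = np.array([
    fused_score({m: make_features(d, shift=1.0, n=50) for m, d in dims.items()})
    for _ in range(30)
])
impostor = np.array([
    fused_score({m: make_features(d, shift=0.0, n=50) for m, d in dims.items()})
    for _ in range(30)
])
print(f"EER on synthetic trials: {equal_error_rate(genuine, impostor):.3f}")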
    Original language: English
    Pages: 1-5
    Number of pages: 5
    Publication status: Published - 2007
    Event: International Conference on Audio-Visual Speech Processing - Hilvarenbeek, Netherlands
    Duration: 31 Aug 2007 – 3 Sep 2007

    Conference

    Conference: International Conference on Audio-Visual Speech Processing
    Country: Netherlands
    City: Hilvarenbeek
    Period: 31/08/07 – 03/09/07
