The Audio-Video Australian English Speech Data Corpus AVOZES

Roland Goecke, J Millar

    Research output: A Conference proceeding or a Chapter in Book › Conference contribution

    31 Citations (Scopus)

    Abstract

    This paper presents the Audio-Video Australian English Speech data corpus AVOZES. It contains recordings of 20 speakers uttering a variety of phrases. The corpus was designed for research on the statistical relationship of audio and video speech parameters with an audio-video (AV) automatic speech recognition (ASR) task in mind, but may be useful for other research tasks. AVOZES is the first published AV speaking-face data corpus for Australian English and is novel in its use of a stereo camera system for the video recordings and its modular design.
    Original language: English
    Title of host publication: INTERSPEECH 2004 - ICSLP: 8th International Conference on Spoken Language Processing
    Editors: S.H. Kim, D.H. Youn
    Place of Publication: Canada
    Publisher: ISCA
    Pages: 2525-2528
    Number of pages: 4
    Publication status: Published - 2004
    Event: INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing - Jeju, Korea, Republic of
    Duration: 3 Oct 2004 - 7 Oct 2004

    Conference

    Conference: INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing
    Country: Korea, Republic of
    City: Jeju
    Period: 3/10/04 - 7/10/04

    Cite this

    Goecke, R., & Millar, J. (2004). The Audio-Video Australian English Speech Data Corpus AVOZES. In S. H. Kim, & D. H. Youn (Eds.), INTERSPEECH 2004 - ICSLP: 8th International Conference on Spoken Language Processing (pp. 2525-2528). Canada: ISCA.