Abstract
This paper presents the Audio-Video Australian English Speech data corpus AVOZES. It contains recordings of 20 speakers uttering a variety of phrases. The corpus was designed for research on the statistical relationship of audio and video speech parameters with an audio-video (AV) automatic speech recognition (ASR) task in mind, but may be useful for other research tasks. AVOZES is the first published AV speaking-face data corpus for Australian English and is novel in its use of a stereo camera system for the video recordings and its modular design.
Original language | English |
---|---|
Title of host publication | INTERSPEECH 2004 - ICSLP: 8th International Conference on Spoken Language Processing |
Editors | S.H. Kim, D.H. Youn |
Place of Publication | Canada |
Publisher | ISCA |
Pages | 2525-2528 |
Number of pages | 4 |
Publication status | Published - 2004 |
Event | INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing - Jeju, Korea, Republic of. Duration: 3 Oct 2004 → 7 Oct 2004 |
Conference
Conference | INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju |
Period | 3/10/04 → 7/10/04 |