Abstract
This paper presents AVOZES, the Audio-Video Australian English Speech data corpus. It contains recordings of 20 speakers uttering a variety of phrases. The corpus was designed for research on the statistical relationship between audio and video speech parameters, with an audio-video (AV) automatic speech recognition (ASR) task in mind, but it may also be useful for other research tasks. AVOZES is the first published AV speaking-face data corpus for Australian English and is novel in its modular design and its use of a stereo camera system for the video recordings.
| Original language | English |
|---|---|
| Title of host publication | INTERSPEECH 2004 - ICSLP: 8th International Conference on Spoken Language Processing |
| Editors | S.H. Kim, D.H. Youn |
| Place of Publication | Canada |
| Publisher | ISCA |
| Pages | 2525-2528 |
| Number of pages | 4 |
| Publication status | Published - 2004 |
| Event | INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing - Jeju, Korea, Republic of; Duration: 3 Oct 2004 → 7 Oct 2004 |
Conference
| Conference | INTERSPEECH 2004 - ICSLP 8th International Conference on Spoken Language Processing |
|---|---|
| Country/Territory | Korea, Republic of |
| City | Jeju |
| Period | 3/10/04 → 7/10/04 |