Summary of the paper

Title WAPUSK20 - A Database for Robust Audiovisual Speech Recognition
Authors Alexander Vorwerk, Xiaohui Wang, Dorothea Kolossa, Steffen Zeiler and Reinhold Orglmeister
Abstract Audiovisual speech recognition (AVSR) systems have been proven superior over audio-only speech recognizers in noisy environments by incorporating features of the visual modality. In order to develop reliable AVSR systems, appropriate simultaneously recorded speech and video data is needed. In this paper, we will introduce a corpus (WAPUSK20) that consists of audiovisual data of 20 speakers uttering 100 sentences each with four channels of audio and a stereoscopic video. The latter is intended to support more accurate lip tracking and the development of stereo data based normalization techniques for greater robustness of the recognition results. The sentence design has been adopted from the GRID corpus that has been widely used for AVSR experiments. Recordings have been made under acoustically realistic conditions in a usual office room. Affordable hardware equipment has been used, such as a pre-calibrated stereo camera and standard PC components. The software written to create this corpus was designed in MATLAB with help of hardware specific software provided by the hardware manufacturers and freely available open source software.
Language Corpus (creation, annotation, etc.)
Topics Speech Recognition/Understanding, Speech resource/database, Corpus (creation, annotation, etc.)
Full paper WAPUSK20 - A Database for Robust Audiovisual Speech Recognition
Bibtex @InProceedings{VORWERK10.533,
  author = {Alexander Vorwerk and Xiaohui Wang and Dorothea Kolossa and Steffen Zeiler and Reinhold Orglmeister},
  title = {WAPUSK20 - A Database for Robust Audiovisual Speech Recognition},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }