Summary of the paper

Title Building High Quality Databases for Minority Languages such as Galician
Authors Francisco Campillo, Daniela Braga, Ana Belén Mourín, Carmen García-Mateo, Pedro Silva, Miguel Sales Dias and Francisco Méndez
Abstract This paper describes the result of a joint R&D project between Microsoft Portugal and the Signal Theory Group of the University of Vigo (Spain), where a set of language resources was developed with application to Text-to-Speech synthesis. First, a large Corpus of 10000 Galician sentences was designed and recorded by a professional female speaker. Second, a lexicon with phonetic and grammatical information of over 90000 entries was collected and reviewed manually by a linguist expert. And finally, these resources were used for a MOS (Mean Opinion Score) perceptual test to compare two state-of-the-art speech synthesizers of both groups, the one from Microsoft based on HMM, and the one from the University of Vigo based on unit selection.
Language Evaluation methodologies
Topics Corpus (creation, annotation, etc.), Lexicon, lexical database, Evaluation methodologies
Full paper Building High Quality Databases for Minority Languages such as Galician
Bibtex @InProceedings{CAMPILLO10.790,
  author = {Francisco Campillo and Daniela Braga and Ana Belén Mourín and Carmen García-Mateo and Pedro Silva and Miguel Sales Dias and Francisco Méndez},
  title = {Building High Quality Databases for Minority Languages such as Galician},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }