Summary of the paper

Title BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals
Authors Florian Schiel
Abstract A new type of language resource called 'BAStat' has been released by the Bavarian Archive for Speech Signals at Ludwig Maximilians Universitaet, Munich. In contrast to primary resources like speech and text corpora BAStat comprises statistical estimates based on a number of primary spoken language resources: first and second order occurrence probability of phones, syllables and words, duration statistics, probabilities of pronunciation variants of words and probabilities of context information. Unlike other statistical speech resources BAStat is based solely on recordings of conversational German and therefore models spoken language not text. The resource consists of a bundle of 7-bit ASCII tables and matrices to maximize inter-operability between different operation systems and can be downloaded for free from the BAS web-site. This contribution gives a detailed description about the empirical basis, the contained data types, the format of the resulting statistical data, some interesting interpretations of grand figures and a brief comparison to the text-based statistical resource CELEX.
Language Language modelling
Topics Speech resource/database, Phonetic Databases, Phonology, Language modelling
Full paper BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals
Bibtex @InProceedings{SCHIEL10.277,
  author = {Florian Schiel},
  title = {BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }