Summary of the paper

Title Improving Proper Name Recognition by Adding Automatically Learned Pronunciation Variants to the Lexicon
Authors Bert Réveil, Jean-Pierre Martens and Henk van den Heuvel
Abstract This paper deals with the task of large vocabulary proper name recognition. In order to accommodate a wide diversity of possible name pronunciations (due to non-native name origins or speaker tongues), a multilingual acoustic model is combined with a lexicon comprising 3 grapheme-to-phoneme (G2P) transcriptions (from G2P transcribers for 3 different languages) and up to 4 so-called phoneme-to-phoneme (P2P) transcriptions. The latter are generated with (speaker tongue, name source) specific P2P converters that try to transform a set of baseline name transcriptions into a pool of transcription variants that lie closer to the 'true' name pronunciations. The experimental results show that the generated P2P variants can be employed to improve name recognition, and that the obtained accuracy is comparable to what is achieved with typical (TY) transcriptions (made by a human expert). Furthermore, it is demonstrated that the P2P conversion can best be instantiated from a baseline transcription in the name source language, and that knowledge of the speaker tongue is an important input as well for the P2P transcription process.
Language Lexicon, lexical database
Topics Speech Recognition/Understanding, Multilinguality, Lexicon, lexical database
Full paper Improving Proper Name Recognition by Adding Automatically Learned Pronunciation Variants to the Lexicon
Bibtex @InProceedings{RVEIL10.281,
  author = {Bert Réveil and Jean-Pierre Martens and Henk van den Heuvel},
  title = {Improving Proper Name Recognition by Adding Automatically Learned Pronunciation Variants to the Lexicon},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }