LREC 2010 Proceedings

Summary of the paper

Title	English-Hindi Transliteration using Multiple Similarity Metrics
Authors	Niraj Aswani and Robert Gaizauskas
Abstract	In this paper, we present an approach to measure the transliteration similarity of English-Hindi word pairs. Our approach has two components. First, we propose a bi-directional mapping between one or more characters in the Devanagari script and one or more characters in the Roman script (pronounced as in English). This allows a given Hindi word written in Devanagari to be transliterated into the Roman script and vice versa. Second, we present an algorithm for computing a similarity measure that is a variant of Dice’s coefficient measure and the LCSR measure and which also takes into account the constraints needed to match English-Hindi transliterated words. Finally, by evaluating various similarity metrics individually and together under a multiple measure agreement scenario, we show that it is possible to achieve a 0.92 f-measure in identifying English-Hindi word pairs that are transliterations. In order to assess the portability of our approach to other similar languages, we adapt our system to the Gujarati language.
Language	Tools, systems, applications
Topics	Phonetic Databases, Phonology, Machine Translation, SpeechToSpeech Translation, Tools, systems, applications
Full paper	English-Hindi Transliteration using Multiple Similarity Metrics
Bibtex	@InProceedings{ASWANI10.694,
  author = {Niraj Aswani and Robert Gaizauskas},
  title = {English-Hindi Transliteration using Multiple Similarity Metrics},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odjik, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
}
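
The abstract names two base measures, Dice's coefficient and LCSR (longest common subsequence ratio), and a multiple measure agreement scenario. The sketch below shows only the textbook forms of those two measures on Roman-script strings, plus a simple voting rule. It is not the authors' variant: the English-Hindi matching constraints and the Devanagari-Roman character mapping are not described on this page, and the function names, the 0.5 threshold, and the `min_votes` rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def dice_coefficient(a: str, b: str) -> float:
    """Textbook Dice coefficient over character bigrams (multiset overlap)."""
    x = Counter(a[i:i + 2] for i in range(len(a) - 1))
    y = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # neither string has a bigram
    overlap = sum((x & y).values())  # shared bigrams, counted with multiplicity
    return 2.0 * overlap / total


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """LCS length divided by the length of the longer string."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def agree(a: str, b: str, threshold: float = 0.5, min_votes: int = 2) -> bool:
    """Hypothetical agreement rule: accept the pair when at least
    `min_votes` of the measures score at or above `threshold`."""
    scores = [dice_coefficient(a, b), lcsr(a, b)]
    return sum(s >= threshold for s in scores) >= min_votes
```

In this sketch a Hindi word would first have to be romanized by some character mapping before being compared with its English candidate; that step is assumed to happen elsewhere. Requiring both measures to agree trades recall for precision, which is one plausible reading of why combining metrics could help, though the page itself does not say how the combination is done.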