Title |
Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar |
Authors |
Muhammad Kamran Malik, Tafseer Ahmed, Sebastian Sulger, Tina Bögel, Atif Gulzar, Ghulam Raza, Sarmad Hussain and Miriam Butt |
Abstract |
In this paper, we present a system for transliterating the Arabic-based script of Urdu to a Roman transliteration scheme. The system is integrated into a larger system consisting of a morphology module, implemented via finite state technologies, and a computational LFG grammar of Urdu that was developed with the grammar development platform XLE (Crouch et al. 2008). Our long-term goal is to handle Hindi alongside Urdu; the two languages are very similar with respect to syntax and lexicon and hence, one grammar can be used to cover both languages. However, they are not similar concerning the script -- Hindi is written in Devanagari, while Urdu uses an Arabic-based script. By abstracting away to a common Roman transliteration scheme in the respective transliterators, our system can be enabled to handle both languages in parallel. In this paper, we discuss the pipeline architecture of the Urdu-Roman transliterator, mention several linguistic and orthographic issues and present the integration of the transliterator into the LFG parsing system. |
Language |
Grammar and Syntax |
Topics |
LR Infrastructures and Architectures, Tools, systems, applications, Grammar and Syntax |
Full paper  |
Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar |
Bibtex |
@InProceedings{MALIK10.194,
  author = {Muhammad Kamran Malik and Tafseer Ahmed and Sebastian Sulger and Tina Bögel and Atif Gulzar and Ghulam Raza and Sarmad Hussain and Miriam Butt},
  title = {Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |