LREC 2010 Proceedings

Summary of the paper

Title	English ― Oromo Machine Translation: An Experiment Using a Statistical Approach
Authors	Sisay Adugna and Andreas Eisele
Abstract	This paper deals with translation of English documents to Oromo using statistical methods. Whereas English is the lingua franca of online information, Oromo, despite its relatively wide distribution within Ethiopia and neighbouring countries like Kenya and Somalia, is one of the most resource-scarce languages. The paper has two main goals: one is to test how far we can go with the available limited parallel corpus for the English ― Oromo language pair and the applicability of existing Statistical Machine Translation (SMT) systems on this language pair. The second goal is to analyze the output of the system with the objective of identifying the challenges that need to be tackled. Since the language is resource-scarce as mentioned above, we cannot get as many parallel documents as we want for the experiment. However, using a limited corpus of 20,000 bilingual sentences and 163,000 monolingual sentences, translation accuracy in terms of BLEU Score of 17.74% was achieved.
Language	Multilinguality
Topics	Machine Translation, SpeechToSpeech Translation, Corpus (creation, annotation, etc.), Multilinguality
Full paper	English ― Oromo Machine Translation: An Experiment Using a Statistical Approach
Bibtex	@InProceedings{ADUGNA10.683, author = {Sisay Adugna and Andreas Eisele}, title = {English ― Oromo Machine Translation: An Experiment Using a Statistical Approach}, booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odjik, Stelios Piperidis, Mike Rosner, Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} }