Summary of the paper

Title Dictionary and Monolingual Corpus-based Query Translation for Basque-English CLIR
Authors Xabier Saralegi and Maddalen Lopez de Lacalle
Abstract This paper deals with the main problems that arise in the query translation process in dictionary-based Cross-lingual Information Retrieval (CLIR): translation selection, presence of Out-Of-Vocabulary (OOV) terms and translation of Multi-Word Expressions (MWE). We analyse to what extent each problem affects the retrieval performance for the Basque-English pair of languages, and the improvement obtained when using parallel corpora free methods to address them. To tackle the translation selection problem we provide novel extensions of an already existing monolingual target co-occurrence-based method, the Out-Of-Vocabulary terms are dealt with by means of a cognate detection-based method and finally, for the Multi-Word Expression translation problem, a naïve matching technique is applied. The error analysis shows significant differences in the deterioration of the performance depending on the problem, in terms of Mean Average Precision (MAP), the translation selection problem being the cause of most of the errors. Otherwise, the proposed combined strategy shows a good performance to tackle the three above-mentioned main problems.
Language Machine Translation, SpeechToSpeech Translation
Topics Information Extraction, Information Retrieval, Multilinguality, Machine Translation, SpeechToSpeech Translation
Full paper Dictionary and Monolingual Corpus-based Query Translation for Basque-English CLIR
Bibtex @InProceedings{SARALEGI10.63,
  author = {Xabier Saralegi and Maddalen Lopez de Lacalle},
  title = {Dictionary and Monolingual Corpus-based Query Translation for Basque-English CLIR},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }