Summary of the paper

Title Construction of a Benchmark Data Set for Cross-lingual Word Sense Disambiguation
Authors Els Lefever and Véronique Hoste
Abstract Given the recent trend to evaluate the performance of word sense disambiguation systems in a more application-oriented set-up, we report on the construction of a multilingual benchmark data set for cross-lingual word sense disambiguation. The data set was created for a lexical sample of 25 English nouns, for which translations were retrieved in 5 languages, namely Dutch, German, French, Italian and Spanish. The corpus underlying the sense inventory was the parallel data set Europarl. The gold standard sense inventory was based on the automatic word alignments of the parallel corpus, which were manually verified. The resulting word alignments were used to perform a manual clustering of the translations over all languages in the parallel corpus. The inventory then served as input for the annotators of the sentences, who were asked to provide a maximum of three contextually relevant translations per language for a given focus word. The data set was released in the framework of the SemEval-2010 competition.
Language Corpus (creation, annotation, etc.)
Topics Word Sense Disambiguation, Multilinguality, Corpus (creation, annotation, etc.)
Full paper Construction of a Benchmark Data Set for Cross-lingual Word Sense Disambiguation
Bibtex @InProceedings{LEFEVER10.34,
  author = {Els Lefever and Véronique Hoste},
  title = {Construction of a Benchmark Data Set for Cross-lingual Word Sense Disambiguation},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }