Summary of the paper

Title The D-TUNA Corpus: A Dutch Dataset for the Evaluation of Referring Expression Generation Algorithms
Authors Ruud Koolen and Emiel Krahmer
Abstract We present the D-TUNA corpus, which is the first semantically annotated corpus of referring expressions in Dutch. Its primary function is to evaluate and improve the performance of REG algorithms. Such algorithms are computational models that automatically generate referring expressions by computing how a specific target can be identified to an addressee by distinguishing it from a set of distractor objects. We performed a large-scale production experiment, in which participants were asked to describe furniture items and people, and provided all descriptions with semantic information regarding the target and the distractor objects. Besides being useful for evaluating REG algorithms, the corpus addresses several other research goals. Firstly, the corpus contains both written and spoken referring expressions uttered in the direction of an addressee, which enables systematic analyses of how modality (text or speech) influences the human production of referring expressions. Secondly, due to its comparability with the English TUNA corpus, our Dutch corpus can be used to explore the differences between Dutch and English speakers regarding the production of referring expressions.
Language Evaluation methodologies
Topics Corpus (creation, annotation, etc.), Natural Language Generation, Evaluation methodologies
Full paper The D-TUNA Corpus: A Dutch Dataset for the Evaluation of Referring Expression Generation Algorithms
Bibtex @InProceedings{KOOLEN10.251,
  author = {Ruud Koolen and Emiel Krahmer},
  title = {The D-TUNA Corpus: A Dutch Dataset for the Evaluation of Referring Expression Generation Algorithms},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }