Summary of the paper

Title Annotating the Enron Email Corpus with Number Senses
Authors Stuart Moore, Sabine Buchholz and Anna Korhonen
Abstract The Enron Email Corpus provides "Real World" text in the business email domain, which is a target domain for many speech and language applications. We present a section of this corpus annotated with number senses - labelling each number as a date, time, year, telephone number, etc. We show that sense categories and their frequencies are very different in this domain than in newswire text. The annotated corpus can provide valuable material for the development of number sense disambiguation techniques. We have released the annotations into the public domain, to allow other researchers to perform comparisons.
Language Other
Topics Corpus (creation, annotation, etc.), Word Sense Disambiguation, Other
Full paper Annotating the Enron Email Corpus with Number Senses
Bibtex @InProceedings{MOORE10.653,
  author = {Stuart Moore and Sabine Buchholz and Anna Korhonen},
  title = {Annotating the Enron Email Corpus with Number Senses},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }