Summary of the paper

Title Evaluating Distributional Properties of Tagsets
Authors Markus Dickinson and Charles Jochim
Abstract We investigate which distributional properties should be present in a tagset by examining different mappings of various current part-of-speech tagsets, looking at English, German, and Italian corpora. Given the importance of distributional information, we present a simple model for evaluating how a tagset mapping captures distribution, specifically by utilizing a notion of frames to capture the local context. In addition to an accuracy metric capturing the internal quality of a tagset, we introduce a way to evaluate the external quality of tagset mappings so that we can ensure that the mapping retains linguistically important information from the original tagset. Although most of the mappings we evaluate are motivated by linguistic concerns, we also explore an automatic, bottom-up way to define mappings, to illustrate that better distributional mappings are possible. Comparing our initial evaluations to POS tagging results, we find that more distributional tagsets can sometimes result in worse accuracy, underscoring the need to carefully define the properties of a tagset.
Language Grammar and Syntax
Topics Part of speech tagging, Evaluation methodologies, Grammar and Syntax
Full paper Evaluating Distributional Properties of Tagsets
Bibtex @InProceedings{DICKINSON10.227,
  author = {Markus Dickinson and Charles Jochim},
  title = {Evaluating Distributional Properties of Tagsets},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }