Summary of the paper

Title Improved Statistical Measures to Assess Natural Language Parser Performance across Domains
Authors Barbara Plank
Abstract We examine the performance of three dependency parsing systems, in particular, their performance variation across Wikipedia domains. We assess the performance variation of (i) Alpino, a deep grammar-based system coupled with a statistical disambiguation versus (ii) MST and Malt, two purely data-driven statistical dependency parsing systems. The question is how the performance of each parser correlates with simple statistical measures of the text (e.g. sentence length, unknown word rate, etc.). This would give us an idea of how sensitive the different systems are to domain shifts, i.e. which system is more in need for domain adaptation techniques. To this end, we extend the statistical measures used by Zhang and Wang (2009) for English and evaluate the systems on several Wikipedia domains by focusing on a freer word-order language, Dutch. The results confirm the general findings of Zhang and Wang (2009), i.e. different parsing systems have different sensitivity against various statistical measure of the text, where the highest correlation to parsing accuracy was found for the measure we added, sentence perplexity.
Language Grammar and Syntax
Topics Parsing, Statistical and machine learning methods, Grammar and Syntax
Full paper Improved Statistical Measures to Assess Natural Language Parser Performance across Domains
Bibtex @InProceedings{PLANK10.801,
  author = {Barbara Plank},
  title = {Improved Statistical Measures to Assess Natural Language Parser Performance across Domains},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }