Title |
Evaluating Complex Semantic Artifacts |
Authors |
Christopher R. Walker and Hannah Copperman |
Abstract |
Evaluating complex Natural Language Processing (NLP) systems can prove extremely difficult. In many cases, the best one can do is to evaluate these systems indirectly, by looking at the impact they have on the performance of the downstream use case. For complex end-to-end systems, these metrics are not always enlightening, especially from the perspective of NLP failure analysis, as component interaction can obscure issues specific to the NLP technology. We present an evaluation program for complex NLP systems designed to produce meaningful aggregate accuracy metrics with sufficient granularity to support active development by NLP specialists. Our goals were threefold: to produce reliable metrics, to produce useful metrics and to produce actionable data. Our use case is a graph-based Wikipedia search index. Since the evaluation of a complex graph structure is beyond the conceptual grasp of a single human judge, the problem needs to be broken down. Slices of complex data reflective of coherent Decision Points provide a good framework for evaluation using human judges (Medero et al., 2006). For NL semantics, there really is no substitute. Leveraging Decision Points allows complex semantic artifacts to be tracked with judge-driven evaluations that are accurate, timely and actionable. |
Language |
Corpus (creation, annotation, etc.) |
Topics |
Semantics, Information Extraction, Information Retrieval, Corpus (creation, annotation, etc.) |
Full paper  |
Evaluating Complex Semantic Artifacts |
Bibtex |
@InProceedings{WALKER10.441,
  author = {Christopher R. Walker and Hannah Copperman},
  title = {Evaluating Complex Semantic Artifacts},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |