LREC 2010 Proceedings

Summary of the paper

Title	The Creation of a Large-Scale LFG-Based Gold Parsebank
Authors	Alexis Baird and Christopher R. Walker
Abstract	Systems for syntactically parsing sentences have long been recognized as a priority in Natural Language Processing. Statistics-based systems require large amounts of high quality syntactically parsed data. Using the XLE toolkit developed at PARC and the LFG Parsebanker interface developed at Bergen, the Parsebank Project at Powerset has generated a rapidly increasing volume of syntactically parsed data. By using these tools, we are able to leverage the LFG framework to provide richer analyses via both constituent (c-) and functional (f-) structures. Additionally, the Parsebanking Project uses source data from Wikipedia rather than source data limited to a specific genre, such as the Wall Street Journal. This paper outlines the process we used in creating a large-scale LFG-Based Parsebank to address many of the shortcomings of previously-created parse banks such as the Penn Treebank. While the Parsebank corpus is still in progress, preliminary results using the data in a variety of contexts already show promise.
Language	Grammar and Syntax
Topics	Parsing, Corpus (creation, annotation, etc.), Grammar and Syntax
Full paper	The Creation of a Large-Scale LFG-Based Gold Parsebank
Bibtex	@InProceedings{BAIRD10.445, author = {Alexis Baird and Christopher R. Walker}, title = {The Creation of a Large-Scale LFG-Based Gold Parsebank}, booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} }