Summary of the paper

Title Hybrid Constituent and Dependency Parsing with Tsinghua Chinese Treebank
Authors Rui Wang and Yi Zhang
Abstract In this paper, we describe our hybrid parsing model on the Mandarin Chinese processing. In particular, we work on the Tsinghua Chinese Treebank (TCT), whose annotation has both constitutes and the head information of each constitute. The model we design combines the mainstream constitute parsing and dependency parsing. We present in detail 1) how to (partially) encode the head information into the constitute parsing, 2) how to encode constitute information into the dependency parsing, and 3) how to restore the head information using the dependency structure. For each of them, we take different strategies to deal with different cases. In an open shared task evaluation, we achieve an f1-score of 85.23% for the constitute parsing, 82.35% with partial head information, and 74.27% with complete head information. The error analysis shows the challenge of restoring multiple-headed constitutes and also some potentials to use the dependency structure to guide the constitute parsing, which will be our future work to explore.
Language
Topics Parsing, Evaluation methodologies
Full paper Hybrid Constituent and Dependency Parsing with Tsinghua Chinese Treebank
Bibtex @InProceedings{WANG10.844,
  author = {Rui Wang and Yi Zhang},
  title = {Hybrid Constituent and Dependency Parsing with Tsinghua Chinese Treebank},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }