Summary of the paper

Title Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging
Authors Kristina Vučković, Željko Agić and Marko Tadić
Abstract In this paper, we present the results of an experiment with utilizing a stochastic morphosyntactic tagger as a pre-processing module of a rule-based chunker and partial parser for Croatian in order to raise its overall chunking and partial parsing accuracy on Croatian texts. In order to conduct the experiment, we have manually chunked and partially parsed 459 sentences from the Croatia Weekly 100 kw newspaper sub-corpus taken from the Croatian National Corpus, that were previously also morphosyntactically disambiguated and lemmatized. Due to the lack of resources of this type, these sentences were designated as a temporary chunking and partial parsing gold standard for Croatian. We have then evaluated the chunker and partial parser in three different scenarios: (1) chunking previously morphosyntactically untagged text, (2) chunking text that was tagged using the stochastic morphosyntactic tagger for Croatian and (3) chunking manually tagged text. The obtained F1-scores for the three scenarios were, respectively, 0.874 (P: 0.825, R: 0.930), 0.891 (P: 0.856, R: 0.928) and 0.914 (P: 0.904, R: 0.925). The paper provides the description of language resources and tools used in the experiment, its setup and discussion of results and perspectives for future work.
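The reported F1-scores can be checked against the precision and recall values given in the abstract. The sketch below assumes the standard balanced F1 definition (the harmonic mean of precision and recall); the abstract does not state the formula, and the function name and scenario labels are illustrative only, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract.
# The dictionary keys are hypothetical labels for the three scenarios.
scenarios = {
    "untagged": (0.825, 0.930),
    "stochastic_tagger": (0.856, 0.928),
    "manual_tags": (0.904, 0.925),
}

# Round to three decimals, the precision used in the abstract.
f1_by_scenario = {
    name: round(f1_score(p, r), 3) for name, (p, r) in scenarios.items()
}

for name, f1 in f1_by_scenario.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Under that assumption the recomputed values agree with the reported 0.874, 0.891 and 0.914, which suggests that the gain across scenarios comes from precision, since recall stays nearly constant.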
Language Grammar and Syntax
Topics Parsing, Part of speech tagging, Grammar and Syntax
Full paper Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging
Bibtex @InProceedings{VUKOVI10.834,
  author = {Kristina Vučković and Željko Agić and Marko Tadić},
  title = {Improving Chunking Accuracy on Croatian Texts by Morphosyntactic Tagging},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }