Summary of the paper

Title Hungarian Dependency Treebank
Authors Veronika Vincze, Dóra Szauter, Attila Almási, György Móra, Zoltán Alexin and János Csirik
Abstract Herein, we present the process of developing the first Hungarian Dependency TreeBank. First, short references are made to dependency grammars we considered important in the development of our Treebank. Second, mention is made of existing dependency corpora for other languages. Third, we present the steps of converting the Szeged Treebank into dependency-tree format: from the originally phrase-structured treebank, we produced dependency trees by automatic conversion, checked and corrected them, thereby creating the first manually annotated dependency corpus for Hungarian. We also go into detail about the two major sets of problems, i.e. coordination and predicative nouns and adjectives. Fourth, we give statistics on the treebank: by now, we have completed the annotation of business news, newspaper articles, legal texts and texts in informatics; at the same time, we are planning to convert the entire corpus into dependency tree format. Finally, we give some hints on the applicability of the system: the present database may be utilized, among others, in information extraction and machine translation as well.
Language Information Extraction, Information Retrieval
Topics Corpus (creation, annotation, etc.), Grammar and Syntax, Information Extraction, Information Retrieval
Full paper Hungarian Dependency Treebank
Bibtex @InProceedings{VINCZE10.465,
  author = {Veronika Vincze and Dóra Szauter and Attila Almási and György Móra and Zoltán Alexin and János Csirik},
  title = {Hungarian Dependency Treebank},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }