Summary of the paper

Title When CORDIAL Becomes Friendly: Endowing the CORDIAL Corpus with a Syntactic Annotation Layer
Authors Catarina Magro
Abstract This paper reports on the syntactic annotation of a previously compiled and tagged corpus of European Portuguese (EP) dialects, the Syntax-oriented Corpus of Portuguese Dialects (CORDIAL-SIN). The parsed version of CORDIAL-SIN is intended to be a more efficient resource for studying dialect syntax by allowing automated searches for various syntactic constructions of interest. To achieve this goal, we adopted a rich annotation system (the UPenn corpora annotation system) which codifies syntactic information of high relevance. The annotation produces tree representations, in the form of labelled parentheses, that are fully searchable with CorpusSearch, a search engine for parsed corpora (Randall, 2005-2007). The present paper focuses on CORDIAL-SIN annotation issues: it presents the general principles and guidelines of the adopted annotation system and describes the methodology for constructing the parsed version of the corpus and for searching it (tools and procedures). The last section addresses the question of how an annotation system originally designed for Middle English can be adapted to meet the particular needs of a Portuguese corpus of dialectal speech.
Language Information Extraction, Information Retrieval
Topics Corpus (creation, annotation, etc.), Parsing, Information Extraction, Information Retrieval
Full paper When CORDIAL Becomes Friendly: Endowing the CORDIAL Corpus with a Syntactic Annotation Layer
Bibtex @InProceedings{MAGRO10.738,
  author = {Catarina Magro},
  title = {When CORDIAL Becomes Friendly: Endowing the CORDIAL Corpus with a Syntactic Annotation Layer},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }
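The abstract describes tree representations written as labelled parentheses that can be searched automatically. As a rough illustration of that idea only, the Python sketch below reads one bracketed tree and collects the constituents carrying a given label. The example sentence, the labels (`IP`, `NP`, `D`, `N`, `VB`), and the helper names (`Node`, `parse`, `find`, `words`) are assumptions made for this sketch; they are not taken from CORDIAL-SIN, and the code does not reproduce CorpusSearch or its query language.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One constituent: a label plus child nodes and/or word strings."""
    label: str
    children: list = field(default_factory=list)


def parse(text):
    """Parse a single labelled-parenthesis tree such as '(NP (D o) (N gato))'."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = Node(tokens[pos])  # first token after '(' is the label
        pos += 1
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())       # nested constituent
            else:
                node.children.append(tokens[pos])  # terminal word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume ')'
        return node

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def find(node, label):
    """Return every node in the tree (depth-first) whose label matches."""
    hits = [node] if node.label == label else []
    for child in node.children:
        if isinstance(child, Node):
            hits.extend(find(child, label))
    return hits


def words(node):
    """Return the words dominated by a node, in surface order."""
    out = []
    for child in node.children:
        out.extend(words(child) if isinstance(child, Node) else [child])
    return out


# Hypothetical example tree; the labels are illustrative, not the corpus tagset.
EXAMPLE = "(IP (NP (D o) (N gato)) (VB dorme))"

if __name__ == "__main__":
    tree = parse(EXAMPLE)
    for np in find(tree, "NP"):
        print(" ".join(words(np)))
```

A dedicated search engine for parsed corpora would support far richer structural relations (dominance, precedence, and so on) than this label lookup; the sketch only shows why a bracketed format lends itself to automated search.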