Summary of the paper

Title Developing a Deep Linguistic Databank Supporting a Collection of Treebanks: the CINTIL DeepGramBank
Authors António Branco, Francisco Costa, João Silva, Sara Silveira, Sérgio Castro, Mariana Avelãs, Clara Pinto and João Graça
Abstract Corpora of sentences annotated with grammatical information have been deployed by extending the basic lexical and morphological data with increasingly complex information, such as phrase constituency, syntactic functions, semantic roles, etc. As these corpora grow in size and the linguistic information to be encoded reaches higher levels of sophistication, the utilization of annotation tools and, above all, supporting computational grammars appear no longer as a matter of convenience but of necessity. In this paper, we report on the design features, the development conditions and the methodological options of a deep linguistic databank, the CINTIL DeepGramBank. In this corpus, sentences are annotated with fully fledged linguistically informed grammatical representations that are produced by a deep linguistic processing grammar, thus consistently integrating morphological, syntactic and semantic information. We also report on how such corpus permits to straightforwardly obtain a whole range of past generation annotated corpora (POS, NER and morphology), current generation treebanks (constituency treebanks, dependency banks, propbanks) and next generation databanks (logical form banks) simply by means of a very residual selection/extraction effort to get the appropriate "views" exposing the relevant layers of information.
Language Semantics
Topics Corpus (creation, annotation, etc.), Grammar and Syntax, Semantics
Full paper Developing a Deep Linguistic Databank Supporting a Collection of Treebanks: the CINTIL DeepGramBank
Bibtex @InProceedings{BRANCO10.154,
  author = {António Branco and Francisco Costa and João Silva and Sara Silveira and Sérgio Castro and Mariana Avelãs and Clara Pinto and João Graça},
  title = {Developing a Deep Linguistic Databank Supporting a Collection of Treebanks: the CINTIL DeepGramBank},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }