Summary of the paper

Title The Leeds Arabic Discourse Treebank: Annotating Discourse Connectives for Arabic
Authors Amal Al-Saif and Katja Markert
Abstract We present the first effort towards producing an Arabic Discourse Treebank, a news corpus where all discourse connectives are identified and annotated with the discourse relations they convey as well as with the two arguments they relate. We discuss our collection of Arabic discourse connectives as well as principles for identifying and annotating them in context, taking into account properties specific to Arabic. In particular, we deal with the fact that Arabic has a rich morphology: we therefore include clitics as connectives as well as a wide range of nominalizations as potential arguments. We present a dedicated discourse annotation tool for Arabic and a large-scale annotation study. We show that both the human identification of discourse connectives and the determination of the discourse relations they convey is reliable. Our current annotated corpus encompasses a final 5651 annotated discourse connectives in 537 news texts. In future, we will release the annotated corpus to other researchers and use it for training and testing automated methods for discourse connective and relation recognition.
Language Validation of LRs
Topics Corpus (creation, annotation, etc.), Discourse annotation, representation and processing, Validation of LRs
Full paper The Leeds Arabic Discourse Treebank: Annotating Discourse Connectives for Arabic
Bibtex @InProceedings{ALSAIF10.479,
  author = {Amal Al-Saif and Katja Markert},
  title = {The Leeds Arabic Discourse Treebank: Annotating Discourse Connectives for Arabic},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }