Summary of the paper

Title Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank
Authors Kais Dukes, Eric Atwell and Abdul-Baquee M. Sharaf
Abstract The Quranic Arabic Dependency Treebank (QADT) is part of the Quranic Arabic Corpus (http://corpus.quran.com), an online linguistic resource organized by the University of Leeds, and developed through online collaborative annotation. The website has become a popular study resource for Arabic and the Quran, and is now used by over 1,500 researchers and students daily. This paper presents the treebank, explains the choice of syntactic representation, and highlights key parts of the annotation guidelines. The text being analyzed is the Quran, the central religious book of Islam, written in classical Quranic Arabic (c. 600 CE). To date, all 77,430 words of the Quran have a manually verified morphological analysis, and syntactic analysis is in progress. 11,000 words of Quranic Arabic have been syntactically annotated as part of a gold standard treebank. Annotation guidelines are especially important to promote consistency for a corpus which is being developed through online collaboration, since often many people will participate from different backgrounds and with different levels of linguistic expertise. The treebank is available online for collaborative correction to improve accuracy, with suggestions reviewed by expert Arabic linguists, and compared against existing published books of Quranic Syntax.
Language Parsing
Topics Corpus (creation, annotation, etc.), Grammar and Syntax, Parsing
Full paper Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank
Bibtex @InProceedings{DUKES10.278,
  author = {Kais Dukes and Eric Atwell and Abdul-Baquee M. Sharaf},
  title = {Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }