Summary of the paper

Title A Pilot Arabic CCGbank
Authors Stephen A. Boxwell and Chris Brew
Abstract We describe a process for converting the Penn Arabic Treebank into the CCG formalism. Previous efforts have yielded CCGbanks in English, German, and Turkish, thus opening these languages to the sophisticated computational tools developed for CCG and enabling further cross-linguistic development. Conversion from a context free grammar treebank to a CCGbank is a four stage process: head finding, argument classification, binarization, and category conversion. In the process of implementing a basic CCGbank conversion algorithm, we reveal properties of Arabic grammar that interfere with conversion, such as subject topicalization, genitive constructions, relative clauses, and optional pronominal subjects. All of these problematic phenomena can be resolved in a variety of ways - we discuss advantages and disadvantages of each in their respective sections. We detail these and describe our categorial analysis of each of these Arabic grammatical phenomena in depth, as well as technical details on their integration into the conversion algorithm.
Language Parsing
Topics Corpus (creation, annotation, etc.), Grammar and Syntax, Parsing
Full paper A Pilot Arabic CCGbank
Bibtex @InProceedings{BOXWELL10.623,
  author = {Stephen A. Boxwell and Chris Brew},
  title = {A Pilot Arabic CCGbank},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }
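
Illustrative sketch of the four conversion stages named in the abstract (head finding, argument classification, binarization, category conversion). This is a minimal toy in Python, not the authors' implementation: the `Tree` class, the `HEAD_LABELS` and `COMPLEMENT_LABELS` tables, the flat verb-initial example tree, and the choice to use a complement's treebank label as its category are all assumptions made for illustration, and none of the Arabic-specific phenomena discussed in the paper are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tree:
    label: str
    children: List["Tree"] = field(default_factory=list)
    word: Optional[str] = None
    role: str = "head"  # "head", "comp" (complement) or "adj" (adjunct)
    cat: str = ""       # CCG category, filled in by assign()


# Hypothetical rule tables; a real conversion needs far richer ones.
HEAD_LABELS = {"S": ["V"]}
COMPLEMENT_LABELS = {"NP"}


def mark_roles(node: Tree) -> None:
    """Stages 1 and 2: pick a head child, then classify its sisters."""
    if not node.children:
        return
    wanted = HEAD_LABELS.get(node.label, [])
    head = next((i for i, c in enumerate(node.children) if c.label in wanted), 0)
    for i, child in enumerate(node.children):
        if i == head:
            child.role = "head"
        elif child.label in COMPLEMENT_LABELS:
            child.role = "comp"
        else:
            child.role = "adj"
        mark_roles(child)


def binarize(node: Tree) -> Tree:
    """Stage 3: attach right sisters to the head first, then left sisters."""
    if not node.children:
        return node
    kids = [binarize(c) for c in node.children]
    h = next(i for i, c in enumerate(kids) if c.role == "head")
    current = kids[h]
    for sib in kids[h + 1:]:
        current = Tree(node.label, [current, sib])
    for sib in reversed(kids[:h]):
        current = Tree(node.label, [sib, current])
    current.role = node.role  # unary nodes collapse into their only child
    return current


def paren(cat: str) -> str:
    return f"({cat})" if "/" in cat or "\\" in cat else cat


def assign(node: Tree, cat: str) -> None:
    """Stage 4: push categories down a binarized tree."""
    node.cat = cat
    if not node.children:
        return
    left, right = node.children
    if left.role == "head":
        head, other, slash, adj_slash = left, right, "/", "\\"
    else:
        head, other, slash, adj_slash = right, left, "\\", "/"
    if other.role == "comp":
        # Simplification: the complement's label doubles as its category.
        assign(other, other.label)
        assign(head, f"{paren(cat)}{slash}{other.label}")
    else:
        assign(other, f"{paren(cat)}{adj_slash}{paren(cat)}")
        assign(head, cat)


def leaves(node: Tree) -> list:
    if not node.children:
        return [(node.word, node.cat)]
    return [pair for c in node.children for pair in leaves(c)]


def convert(tree: Tree) -> Tree:
    mark_roles(tree)
    binary = binarize(tree)
    assign(binary, "S")
    return binary


# Toy verb-initial clause: verb, subject NP, object NP, adverb.
example = Tree("S", [
    Tree("V", word="kataba"),
    Tree("NP", word="al-waladu"),
    Tree("NP", word="al-kitaba"),
    Tree("ADV", word="amsi"),
])
lexicon = leaves(convert(example))
print(lexicon)
```

On this toy input the verb receives the category `(S/NP)/NP`, both noun phrases receive `NP`, and the adverb, classified as an adjunct to the right of the head, receives `S\S`.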