Title |
Arabic Parsing Using Grammar Transforms |
Authors |
Lamia Tounsi and Josef van Genabith |
Abstract |
We investigate Arabic Context Free Grammar parsing with dependency annotation, comparing lexicalised and unlexicalised parsers. We study how morphosyntactic as well as function tag information percolation in the form of grammar transforms (Johnson, 1998; Kulick et al., 2006) affects the performance of a parser and helps dependency assignment. We focus on the three most frequent functional tags in the Arabic Penn Treebank: subjects, direct objects and predicates. We merge these functional tags with their phrasal categories and (where appropriate) percolate case information to the non-terminal (POS) category to train the parsers. We then automatically enrich the output of these parsers with full dependency information in order to annotate trees with Lexical Functional Grammar (LFG) f-structure equations, which produce f-structures, i.e. attribute-value matrices approximating basic predicate-argument-adjunct structure representations. We present a series of experiments evaluating how well lexicalised, history-based, generative (Bikel) as well as latent variable PCFG (Berkeley) parsers cope with the enriched Arabic data. We measure quality and coverage of both the output trees and the generated LFG f-structures. We show that joint functional and morphological information percolation improves both the recovery of trees and the dependency results in the form of LFG f-structures. |
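The function tag merging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `(label, children)` tree encoding, the `_` separator, the function names and the toy sentence with simplified POS tags are all assumptions made for illustration. Only the choice of the three retained tags (subject, direct object, predicate) comes from the abstract, and case percolation is not shown.

```python
# Hypothetical sketch: fuse selected function tags into phrasal categories.
KEPT_TAGS = {"SBJ", "OBJ", "PRD"}  # subjects, direct objects, predicates


def transform_label(label):
    """Keep only SBJ/OBJ/PRD tags and fuse them to the category with '_'."""
    if label.startswith("-"):          # e.g. -NONE-: leave special labels alone
        return label
    category, *tags = label.split("-")
    kept = [t for t in tags if t in KEPT_TAGS]
    return "_".join([category] + kept)


def transform_tree(tree):
    """Relabel every node of a (label, children) tree; leaves are strings."""
    if isinstance(tree, str):          # a token: returned unchanged
        return tree
    label, children = tree
    return (transform_label(label), [transform_tree(c) for c in children])


# Toy tree with illustrative (not treebank-exact) labels and tokens.
example = ("S", [
    ("VP", [
        ("VBD", ["kataba"]),
        ("NP-SBJ", [("NN", ["Alwaladu"])]),
        ("NP-OBJ", [("NN", ["risAlatan"])]),
        ("PP-TMP", [("IN", ["fy"]), ("NP", [("NN", ["AlSabAH"])])]),
    ]),
])

transformed = transform_tree(example)
```

In this sketch `NP-SBJ` becomes the single category `NP_SBJ`, while tags outside the retained set (here `-TMP`) are dropped, so a parser trained on the transformed trees treats the fused labels as ordinary non-terminals.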
Language |
Part of speech tagging |
Topics |
Parsing, Grammar and Syntax, Part of speech tagging |
Full paper  |
Arabic Parsing Using Grammar Transforms |
Bibtex |
author = {Lamia Tounsi and Josef van Genabith}, title = {Arabic Parsing Using Grammar Transforms}, booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} } |