Summary of the paper

Title Romanian Zero Pronoun Distribution: A Comparative Study
Authors Claudiu Mihăilă, Iustina Ilisei and Diana Inkpen
Abstract Anaphora resolution is still a challenging research field in natural language processing, lacking an algorithm that correctly resolves anaphoric pronouns. Anaphoric zero pronouns pose an even greater challenge, since this category is not lexically realised. Thus, their resolution is conditioned by their prior identification stage. This paper reports on the distribution of zero pronouns in Romanian in various genres: encyclopaedic, legal, literary, and news-wire texts. For this purpose, the RoZP corpus has been created, containing almost 50000 tokens and 800 zero pronouns which are manually annotated. The distribution patterns are compared across genres, and exceptional cases are presented in order to facilitate the methodological process of developing a future zero pronoun identification and resolution algorithm. The evaluation results emphasise that zero pronouns appear frequently in Romanian, and their distribution depends largely on the genre. Additionally, possible features are revealed for their identification, and a search scope for the antecedent has been determined, increasing the chances of correct resolution.
Language
Topics Anaphora, Coreference, Corpus (creation, annotation, etc.)
Full paper Romanian Zero Pronoun Distribution: A Comparative Study
Bibtex @InProceedings{MIHIL10.851,
  author = {Claudiu Mihăilă and Iustina Ilisei and Diana Inkpen},
  title = {Romanian Zero Pronoun Distribution: A Comparative Study},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }