Title |
A Typology of Near-Identity Relations for Coreference (NIDENT) |
Authors |
Marta Recasens, Eduard Hovy and M. Antònia Martí |
Abstract |
The task of coreference resolution requires people or systems to decide when two referring expressions refer to the 'same' entity or event. In real text, this is often a difficult decision because identity is never adequately defined, leading to contradictory treatment of cases in previous work. This paper introduces the concept of 'near-identity', a middle ground category between identity and non-identity, to handle such cases systematically. We present a typology of Near-Identity Relations (NIDENT) that includes fifteen types, grouped under four main families, that capture a wide range of ways in which (near-)coreference relations hold between discourse entities. We validate the theoretical model by annotating a small sample of real data and showing that inter-annotator agreement is high enough for stability (K=0.58, and up to K=0.65 and K=0.84 when leaving out one and two outliers, respectively). This work enables subsequent creation of the first internally consistent language resource of this type through larger annotation efforts. |
Language |
Discourse annotation, representation and processing |
Topics |
Anaphora, Coreference, Corpus (creation, annotation, etc.), Discourse annotation, representation and processing |
Full paper  |
A Typology of Near-Identity Relations for Coreference (NIDENT) |
Bibtex |
@InProceedings{RECASENS10.160,
  author = {Marta Recasens and Eduard Hovy and M. Antònia Martí},
  title = {A Typology of Near-Identity Relations for Coreference (NIDENT)},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |