Title |
Inter-sentential Relations in Information Extraction Corpora |
Authors |
Kumutha Swampillai and Mark Stevenson |
Abstract |
In natural language, relationships between entities can be asserted within a single sentence or over many sentences in a document. Many information extraction systems are constrained to extracting binary relations that are asserted within a single sentence (single-sentence relations), and this limits the proportion of relations they can extract, since those expressed across multiple sentences (inter-sentential relations) are not considered. The analysis in this paper focuses on finding the distribution of inter-sentential and single-sentence relations in two corpora used for the evaluation of Information Extraction systems: the MUC6 corpus and the ACE corpus from 2003. In order to carry out this analysis we had to manually mark up all the management succession relations described in the MUC6 corpus. It was found that inter-sentential relations constitute 28.5% and 9.4% of the total number of relations in MUC6 and ACE03 respectively. This places upper bounds on the recall of information extraction systems that do not consider relations that are asserted across multiple sentences (71.5% and 90.6% respectively). |
Language |
Validation of LRs |
Topics |
Corpus (creation, annotation, etc.), Information Extraction, Information Retrieval, Validation of LRs |
Full paper  |
Inter-sentential Relations in Information Extraction Corpora |
Bibtex |
@InProceedings{SWAMPILLAI10.905,
  author = {Kumutha Swampillai and Mark Stevenson},
  title = {Inter-sentential Relations in Information Extraction Corpora},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |