Summary of the paper

Title Hybrid Citation Extraction from Patents
Authors Olivier Galibert, Sophie Rosset, Xavier Tannier and Fanny Grandry
Abstract The Quaero project organized a set of evaluations of Named Entity recognition systems in 2009. One of the sub-tasks consists in extracting citations from patents, i.e. references to other documents, either other patents or general literature, from English-language patents. We present in this paper the participation of LIMSI in this evaluation, with a complete system description and the evaluation results. The corpus showed that patent and non-patent citations have a very different nature. We then separated references to other patents from references to general literature papers and created a hybrid system. For patent citations, the system used rule-based expert knowledge in the form of regular expressions. The system for detecting non-patent citations, on the other hand, is purely stochastic (machine learning with CRF++). We then combined both approaches to provide a single output. Four teams participated in this task and our system obtained the best results of this evaluation campaign, even if the difference between the first two systems is only weakly significant.
Language Tools, systems, applications
Topics Named Entity recognition, Information Extraction, Information Retrieval, Tools, systems, applications
Full paper Hybrid Citation Extraction from Patents
Bibtex @InProceedings{GALIBERT10.81,
  author = {Olivier Galibert and Sophie Rosset and Xavier Tannier and Fanny Grandry},
  title = {Hybrid Citation Extraction from Patents},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }