Summary of the paper

Title Corpus and Evaluation Measures for Automatic Plagiarism Detection
Authors Alberto Barrón-Cedeño, Martin Potthast, Paolo Rosso and Benno Stein
Abstract The simple access to texts on digital libraries and the World Wide Web has led to an increased number of plagiarism cases in recent years, which renders manual plagiarism detection infeasible at large. Various methods for automatic plagiarism detection have been developed whose objective is to assist human experts in the analysis of documents for plagiarism. The methods can be divided into two main approaches: intrinsic and external. Unlike other tasks in natural language processing and information retrieval, it is not possible to publish a collection of real plagiarism cases for evaluation purposes since they cannot be properly anonymized. Therefore, current evaluations found in the literature are incomparable and, very often not even reproducible. Our contribution in this respect is a newly developed large-scale corpus of artificial plagiarism useful for the evaluation of intrinsic as well as external plagiarism detection. Additionally, new detection performance measures tailored to the evaluation of plagiarism detection algorithms are proposed.
Language Evaluation methodologies
Topics Information Extraction, Information Retrieval, Corpus (creation, annotation, etc.), Evaluation methodologies
Full paper Corpus and Evaluation Measures for Automatic Plagiarism Detection
Bibtex @InProceedings{BARRNCEDEO10.35,
  author = {Alberto Barrón-Cedeño and Martin Potthast and Paolo Rosso and Benno Stein},
  title = {Corpus and Evaluation Measures for Automatic Plagiarism Detection},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }