Summary of the paper

Title Fairy Tale Corpus Organization Using Latent Semantic Mapping and an Item-to-item Top-n Recommendation Algorithm
Authors Paula Vaz Lobo and David Martins de Matos
Abstract In this paper we present a fairy tale corpus that was semantically organized and tagged. The proposed method uses latent semantic mapping to represent the stories and a top-n item-to-item recommendation algorithm to define clusters of similar stories. Each story can be placed in more than one cluster, and stories in the same cluster are related to the same concepts. The results were manually evaluated regarding the groupings as perceived by human judges. The evaluation resulted in a precision of 0.81, a recall of 0.69, and an f-measure of 0.75 when using tf*idf for word frequency. Our method is topic- and language-independent and, contrary to traditional clustering methods, automatically defines the number of clusters based on the set of documents. This method can be used as a setup for traditional clustering or classification. The resulting corpus will be used for recommendation purposes, although it can also be used for emotion extraction, semantic role extraction, meaning extraction, and text classification, among others.
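The pipeline named in the abstract (tf*idf weighting, a latent semantic representation, then top-n item-to-item neighbours that form overlapping groups) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, the smoothed idf formula, the use of a truncated SVD for the latent space, and the choice of one group per seed story are all assumptions made for illustration. The last helper shows that the reported f-measure agrees with the harmonic mean of the reported precision and recall.

```python
import numpy as np

def tfidf(docs):
    """Build a documents x terms tf*idf matrix (assumed: whitespace tokens)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.lower().split():
            tf[i, index[w]] += 1
    df = (tf > 0).sum(axis=0)
    # Assumed smoothing: +1 keeps terms shared by every document non-zero.
    idf = np.log(len(docs) / df) + 1.0
    return tf * idf

def latent_vectors(matrix, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def top_n_groups(vectors, n):
    """For each story, group it with its n most cosine-similar stories.

    A story can appear in several groups, which mirrors the overlapping
    clusters described in the abstract (the grouping rule is an assumption).
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # never recommend a story to itself
    groups = []
    for i in range(len(sim)):
        neighbours = np.argsort(-sim[i])[:n]
        groups.append({i, *neighbours.tolist()})
    return groups

def f_measure(precision, recall):
    """Balanced f-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

Calling `top_n_groups(latent_vectors(tfidf(stories), k), n)` on a list of story texts returns one set of story indices per seed story; `k` and `n` are hypothetical tuning parameters, not values taken from the paper.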
Language Other
Topics Corpus (creation, annotation, etc.), Semantics, Other
Full paper Fairy Tale Corpus Organization Using Latent Semantic Mapping and an Item-to-item Top-n Recommendation Algorithm
Bibtex @InProceedings{VAZLOBO10.786,
  author = {Paula Vaz Lobo and David Martins de Matos},
  title = {Fairy Tale Corpus Organization Using Latent Semantic Mapping and an Item-to-item Top-n Recommendation Algorithm},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }