Summary of the paper

Title Comment Extraction from Blog Posts and Its Applications to Opinion Mining
Authors Huan-An Kao and Hsin-Hsi Chen
Abstract Blog posts containing many personal experiences or perspectives toward specific subjects are useful. Blogs allow readers to interact with bloggers by placing comments on specific blog posts. The comments carry viewpoints of readers toward the targets described in the post, or supportive/non-supportive attitude toward the post. Comment extraction is challenging because there does not exist a unique template among all blog service providers. This paper proposes methods to deal with this problem. Firstly, the repetitive patterns and their corresponding blocks are extracted from input posts by a pattern identification algorithm. Secondly, three filtering strategies, i.e., tag pattern loop filtering, rule overlap filtering, and longest rule first, are used to remove non-comment blocks. Finally, a comment/non-comment classifier is learned to distinguish comment blocks from non-comment blocks with 14 block-level features and 5 rule-level features. In the experiments, we randomly select 600 blog posts from 12 blog service providers. F-measure, recall, and precision are 0.801, 0.855, and 0.780, respectively, by using all of the three filtering strategies together with some selected features. The application of comment extraction to blog mining is also illustrated. We show how to identify the relevant opinionated objects (say, opinion holders, opinions, and targets) from posts.
Language Text mining
Topics Information Extraction, Information Retrieval, Document Classification, Text categorisation, Text mining
Full paper Comment Extraction from Blog Posts and Its Applications to Opinion Mining
Bibtex @InProceedings{KAO10.17,
  author = {Huan-An Kao and Hsin-Hsi Chen},
  title = {Comment Extraction from Blog Posts and Its Applications to Opinion Mining},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }
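
Illustration The abstract's first step extracts repetitive patterns from a post. The sketch below is not the authors' pattern identification algorithm. It is only a minimal illustration of the general idea, assuming the page has already been flattened into a sequence of tag names. The function name find_tag_loops and the sample tag sequence are hypothetical.

```python
from typing import List, Tuple

Loop = Tuple[Tuple[str, ...], int, int]  # (pattern, start index, repeat count)


def find_tag_loops(tags: List[str], min_repeats: int = 2) -> List[Loop]:
    """Find tag patterns that repeat back to back in a flat tag sequence.

    Hypothetical helper for illustration only; a real extractor would work
    on the DOM tree and apply further filtering and classification.
    """
    found: List[Loop] = []
    n = len(tags)
    # Try every pattern length that could still repeat min_repeats times.
    for size in range(1, n // min_repeats + 1):
        start = 0
        while start + size * min_repeats <= n:
            pattern = tuple(tags[start:start + size])
            repeats = 1
            # Count how many consecutive copies of the pattern follow.
            while tuple(tags[start + repeats * size:
                             start + (repeats + 1) * size]) == pattern:
                repeats += 1
            if repeats >= min_repeats:
                found.append((pattern, start, repeats))
                start += repeats * size  # skip past the whole loop
            else:
                start += 1
    return found


# Assumed toy input: a heading, three comment-like blocks, then a form.
sample = ["h1", "div", "p", "div", "p", "div", "p", "form"]
loops = find_tag_loops(sample)
print(loops)
```

On the sample sequence, the only repeated unit is the pair ("div", "p"), which occurs three times in a row starting at index 1. Candidate loops like this would still have to pass the filtering strategies and the comment/non-comment classifier described in the abstract.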