Summary of the paper

Title News Image Annotation on a Large Parallel Text-image Corpus
Authors Pierre Tirilly, Vincent Claveau and Patrick Gros
Abstract In this paper, we present a multimodal parallel text-image corpus, and propose an image annotation method that exploits the textual information associated with images. Our corpus contains news articles composed of a text, images and image captions, and is significantly larger than the other news corpora proposed in image annotation papers (27,041 articles and 42,568 captioned images). In our experiments, we use the text of the articles as a textual information source to annotate images, and image captions as a ground truth to evaluate our annotation algorithm. Our annotation method identifies relevant named entities in the texts, and associates them with high-level visual concepts detected in the images (in this paper, faces and logos). The named entities most suited to image annotation are selected using an unsupervised score based on their statistics, inspired by the weights used in information retrieval. Our experiments show that, although it is very simple, our annotation method achieves an acceptable accuracy on our real-world news corpus.
Language Information Extraction, Information Retrieval
Topics Multimedia Document Processing, Corpus (creation, annotation, etc.), Information Extraction, Information Retrieval
Full paper News Image Annotation on a Large Parallel Text-image Corpus
Bibtex @InProceedings{TIRILLY10.772,
  author = {Pierre Tirilly and Vincent Claveau and Patrick Gros},
  title = {News Image Annotation on a Large Parallel Text-image Corpus},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }