Title |
Building a Domain-specific Document Collection for Evaluating Metadata Effects on Information Retrieval |
Authors |
Walid Magdy, Jinming Min, Johannes Leveling and Gareth J. F. Jones |
Abstract |
This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in information retrieval (IR). The collection consists of more than 61,000 documents extracted from YouTube video pages on basketball in general and NBA (National Basketball Association) in particular, together with a set of 40 topics and their relevance judgements. In addition, a collection of nearly 250,000 user profiles related to the NBA collection is available. Several baseline IR experiments report the effect of using video-associated metadata on retrieval effectiveness. The results surprisingly show that searching the video titles only performs significantly better than searching additional metadata text fields of the videos such as the tags or the description. |
Language |
Multimedia Document Processing |
Topics |
Metadata, Information Extraction, Information Retrieval, Multimedia Document Processing |
Full paper |
Building a Domain-specific Document Collection for Evaluating Metadata Effects on Information Retrieval |
Bibtex |
@InProceedings{MAGDY10.353,
  author    = {Walid Magdy and Jinming Min and Johannes Leveling and Gareth J. F. Jones},
  title     = {Building a Domain-specific Document Collection for Evaluating Metadata Effects on Information Retrieval},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year      = {2010},
  month     = {may},
  date      = {19-21},
  address   = {Valletta, Malta},
  editor    = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn      = {2-9517408-6-7},
  language  = {english}
} |