Title |
Evaluating Human-Machine Conversation for Appropriateness |
Authors |
Nick Webb, David Benyon, Preben Hansen and Oli Mival
Abstract |
Evaluation of complex, collaborative dialogue systems is a difficult task. Traditionally, developers have relied upon subjective feedback from the user, and parametrisation over observable metrics. However, both models place some reliance on the notion of a task; that is, the system is helping the user achieve some clearly defined goal, such as booking a flight or completing a banking transaction. It is not clear that such metrics are as useful when dealing with a system that has a more complex task, or even no definable task at all, beyond maintaining and performing a collaborative dialogue. Working within the EU-funded COMPANIONS programme, we investigate the use of appropriateness as a measure of conversation quality, the hypothesis being that good companions need to be good conversational partners. We report initial work in the direction of annotating dialogue for indicators of good conversation, including the annotation and comparison of the output of two generations of the same dialogue system.
Language |
English
Topics |
Dialogue, Evaluation methodologies, Usability, user satisfaction |
Full paper  |
Evaluating Human-Machine Conversation for Appropriateness |
Bibtex |
@InProceedings{WEBB10.115,
  author = {Nick Webb and David Benyon and Preben Hansen and Oli Mival},
  title = {Evaluating Human-Machine Conversation for Appropriateness},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
}