Summary of the paper

Title Spontal-N: A Corpus of Interactional Spoken Norwegian
Authors Rein Ove Sikveland, Anton Öttl, Ingunn Amdal, Mirjam Ernestus, Torbjørn Svendsen and Jens Edlund
Abstract Spontal-N is a corpus of spontaneous, interactional Norwegian. To our knowledge, it is the first corpus of Norwegian in which the majority of speakers have spent significant parts of their lives in Sweden, and in which the recorded speech displays varying degrees of interference from Swedish. The corpus consists of studio quality audio- and video-recordings of four 30-minute free conversations between acquaintances, and a manual orthographic transcription of the entire material. On the basis of the orthographic transcriptions, we automatically annotated approximately 50 percent of the material on the phoneme level, by means of a forced alignment between the acoustic signal and pronunciations listed in a dictionary. Approximately seven percent of the automatic transcription was manually corrected. Taking the manual correction as a gold standard, we evaluated several sources of pronunciation variants for the automatic transcription. Spontal-N is intended as a general purpose speech resource that is also suitable for investigating phonetic detail.
Language Discourse annotation, representation and processing
Topics Corpus (creation, annotation, etc.), Dialogue, Discourse annotation, representation and processing
Full paper Spontal-N: A Corpus of Interactional Spoken Norwegian
Bibtex @InProceedings{SIKVELAND10.314,
  author = {Rein Ove Sikveland and Anton Öttl and Ingunn Amdal and Mirjam Ernestus and Torbjørn Svendsen and Jens Edlund},
  title = {Spontal-N: A Corpus of Interactional Spoken Norwegian},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }