Title |
Multimodal Russian Corpus (MURCO): First Steps |
Authors |
Elena Grishina |
Abstract |
The paper introduces the Multimodal Russian Corpus (MURCO), which has been created in the framework of the Russian National Corpus (RNC). The MURCO provides users with a large amount of phonetic, orthoepic, and intonational information related to Russian. Moreover, the deeply annotated part of the MURCO contains data concerning Russian gesticulation, the speech act system, types of vocal gestures and interjections in Russian, and so on. The Corpus is freely accessible. The paper describes the main types of annotation and the interface structure of the MURCO. The MURCO consists of two parts, the second being a subset of the first: 1) the whole Corpus, which is annotated from the lexical (lemmatization), morphological, semantic, accentological, metatextual, and sociological points of view (these types of annotation are standard for the RNC), and also from the point of view of phonetics (the orthoepic annotation and the mark-up of accentological word structure); 2) the deeply annotated MURCO, which is additionally annotated from the point of view of gesticulation and speech act structure. |
Language |
Speech resource/database |
Topics |
Corpus (creation, annotation, etc.), Discourse annotation, representation and processing, Speech resource/database |
Full paper  |
Multimodal Russian Corpus (MURCO): First Steps |
Bibtex |
@InProceedings{GRISHINA10.143,
  author = {Elena Grishina},
  title = {Multimodal Russian Corpus (MURCO): First Steps},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |