Summary of the paper

Title A Positional Tagset for Russian
Authors Jirka Hana and Anna Feldman
Abstract Fusional languages have rich inflection. As a consequence, tagsets capturing their morphological features are necessarily large. A natural way to make a tagset manageable is to use a structured system. In this paper, we present a positional tagset for describing morphological properties of Russian. The tagset was inspired by the Czech positional system (Hajic, 2004). We have used preliminary versions of this tagset in our previous work (e.g., Hana et al. (2004, 2006); Feldman (2006); Feldman and Hana (2010)). Here, we both systematize and extend these preliminary versions (by adding information about animacy, aspect and reflexivity); give a more detailed description of the tagset and provide comparison with the Czech system. Each tag of the tagset consists of 16 positions, each encoding one morphological feature (part-of-speech, detailed part-of-speech, gender, animacy, number, case, possessor's gender and number, person, reflexivity, tense, aspect, degree of comparison, negation, voice, variant). The tagset contains approximately 2,000 tags.
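As an illustration of the positional idea described in the abstract, the following is a minimal Python sketch of splitting a 16-character tag into named features. It rests on several assumptions that the abstract does not confirm: that the positions occur in the order in which the features are listed, that each position is a single character, and that "-" marks a feature that does not apply (a convention borrowed from descriptions of the Czech positional system). The names POSITIONS, NOT_APPLICABLE and decode_tag, and the example tag and its value letters, are hypothetical and are not taken from the paper's value tables.

```python
from typing import Dict

# Feature names in the order the abstract lists them.
# ASSUMPTION: this is also the order of the positions in a tag.
POSITIONS = (
    "pos", "subpos", "gender", "animacy", "number", "case",
    "possessor_gender", "possessor_number", "person", "reflexivity",
    "tense", "aspect", "degree", "negation", "voice", "variant",
)

# ASSUMPTION: "-" means "feature not applicable", as in the Czech system.
NOT_APPLICABLE = "-"


def decode_tag(tag: str) -> Dict[str, str]:
    """Map each applicable position of a positional tag to its feature name."""
    if len(tag) != len(POSITIONS):
        raise ValueError(
            f"expected {len(POSITIONS)} positions, got {len(tag)}: {tag!r}"
        )
    return {
        name: value
        for name, value in zip(POSITIONS, tag)
        if value != NOT_APPLICABLE
    }


# Hypothetical tag: six filled positions followed by ten unused ones.
# The value letters are illustrative only.
example_tag = "NNFIS1" + NOT_APPLICABLE * 10

for feature, value in decode_tag(example_tag).items():
    print(f"{feature}: {value}")
```

Because every feature sits at a fixed position, a tag can be queried or partially matched by index without parsing, which is what makes a set of roughly 2,000 tags manageable in practice.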
Language Corpus (creation, annotation, etc.)
Topics Part of speech tagging, Morphology, Corpus (creation, annotation, etc.)
Full paper A Positional Tagset for Russian
Bibtex @InProceedings{HANA10.807,
  author = {Jirka Hana and Anna Feldman},
  title = {A Positional Tagset for Russian},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }