Summary of the paper

Title The German Reference Corpus DeReKo: A Primordial Sample for Linguistic Research
Authors Marc Kupietz, Cyril Belica, Holger Keibel and Andreas Witt
Abstract This paper describes DeReKo (Deutsches Referenzkorpus), the Archive of General Reference Corpora of Contemporary Written German at the Institut für Deutsche Sprache (IDS) in Mannheim, and the rationale behind its development. We discuss its design, its legal background, how to access it, available metadata, linguistic annotation layers, underlying standards, ongoing developments, and aspects of using the archive for empirical linguistic research. The focus of the paper is on the advantages of DeReKo's design as a primordial sample from which virtual corpora can be drawn for the specific purposes of individual studies. Both concepts, primordial sample and virtual corpus, are explained and illustrated in detail. Furthermore, we describe in more detail how DeReKo deals with the fact that all its texts are subject to third parties' intellectual property rights, and how it deals with the issue of replicability, which is particularly challenging given DeReKo's dynamic growth and the possibility to construct from it an open number of virtual corpora.
Language Standards for LRs
Topics Corpus (creation, annotation, etc.), LR Infrastructures and Architectures, Standards for LRs
Full paper The German Reference Corpus DeReKo: A Primordial Sample for Linguistic Research
Bibtex @InProceedings{KUPIETZ10.414,
  author = {Marc Kupietz and Cyril Belica and Holger Keibel and Andreas Witt},
  title = {The German Reference Corpus DeReKo: A Primordial Sample for Linguistic Research},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }