Title |
Term and Collocation Extraction by Means of Complex Linguistic Web Services |
Authors |
Ulrich Heid, Fabienne Fritzinger, Erhard Hinrichs, Marie Hinrichs and Thomas Zastrow |
Abstract |
We present a web service-based environment for the use of linguistic resources and tools to address issues of terminology and language varieties. We discuss the architecture, corpus representation formats, components and a chainer supporting the combination of tools into task-specific services. Integrated into this environment, single web services also become part of complex scenarios for web service use. Our web services take for example corpora of several million words as an input on which they perform preprocessing, such as tokenisation, tagging, lemmatisation and parsing, and corpus exploration, such as collocation extraction and corpus comparison. Here we present an example on extraction of single and multiword items typical of a specific domain or typical of a regional variety of German. We also give a critical review on needs and available functions from a user's point of view. The work presented here is part of ongoing experimentation in the D-SPIN project, the German national counterpart of CLARIN. |
Language |
LR Infrastructures and Architectures |
Topics |
Lexicon, lexical database, MultiWord Expressions & Collocations, LR Infrastructures and Architectures |
Full paper  |
Term and Collocation Extraction by Means of Complex Linguistic Web Services |
Bibtex |
@InProceedings{HEID10.363,
  author    = {Ulrich Heid and Fabienne Fritzinger and Erhard Hinrichs and Marie Hinrichs and Thomas Zastrow},
  title     = {Term and Collocation Extraction by Means of Complex Linguistic Web Services},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year      = {2010},
  month     = {may},
  date      = {19-21},
  address   = {Valletta, Malta},
  editor    = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn      = {2-9517408-6-7},
  language  = {english}
} |