Summary of the paper

Title Building a Node of the Accessible Language Technology Infrastructure
Authors Bartosz Broda, Michał Marcińczuk and Maciej Piasecki
Abstract A limited prototype of the CLARIN Language Technology Infrastructure (LTI) node is presented. The node prototype provides several types of web services for Polish. The functionality encompasses morpho-syntactic processing, shallow semantic processing of corpora on the basis of the SuperMatrix system, and plWordNet browsing. We take the prototype as the starting point for a discussion of the requirements that must be fulfilled by the LTI. Some possible solutions are proposed for less frequently discussed problems, e.g. streaming processing of language data on the remote processing node. We experimentally investigate how to address several of the requirements discussed. Aspects such as processing large volumes of data, an asynchronous mode of processing, and scalability of the architecture to a large number of users received special attention in the constructed prototype of the Web Service for morpho-syntactic processing of Polish, called TaKIPI-WS (http://plwordnet.pwr.wroc.pl/clarin/ws/takipi/). TaKIPI-WS is a distributed system with a three-layer architecture, an asynchronous model of request handling and multi-agent-based processing. Its three layers are the WS Interface, the Database and the Daemons. The role of the Database is to store and exchange data between the Interface and the Daemons. The Daemons (i.e. taggers) are responsible for executing the requests queued in the database. Results of the performance tests are also presented in the paper.
Language LR national/international projects, organizational/policy issues
Topics LR Infrastructures and Architectures, Metadata, LR national/international projects, organizational/policy issues
Full paper Building a Node of the Accessible Language Technology Infrastructure
Bibtex @InProceedings{BRODA10.690,
  author = {Bartosz Broda and Michał Marcińczuk and Maciej Piasecki},
  title = {Building a Node of the Accessible Language Technology Infrastructure},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }
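
The abstract describes the three-layer, asynchronous design only in outline. The following is a minimal sketch of that idea, not the TaKIPI-WS implementation: it assumes an in-memory SQLite table as the Database layer, and every name in it (the requests table and its columns, open_database, submit, daemon_step, fetch, placeholder_tagger) is hypothetical. The tagger is a stand-in that attaches a dummy tag to each word.

```python
import sqlite3
import uuid

# Hypothetical sketch: the abstract names only the three layers
# (WS Interface, Database, Daemons); all identifiers below are assumptions.


def open_database() -> sqlite3.Connection:
    """Database layer: one table shared by the Interface and the Daemons."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE requests ("
        " token TEXT PRIMARY KEY,"
        " text TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " result TEXT)"
    )
    return db


def submit(db: sqlite3.Connection, text: str) -> str:
    """Interface layer: queue the text and return a token without waiting."""
    token = uuid.uuid4().hex
    db.execute(
        "INSERT INTO requests (token, text, status) VALUES (?, ?, 'queued')",
        (token, text),
    )
    db.commit()
    return token


def placeholder_tagger(text: str) -> str:
    """Stand-in for a real tagger: marks every word with a dummy tag."""
    return " ".join(f"{word}/UNK" for word in text.split())


def daemon_step(db: sqlite3.Connection) -> bool:
    """Daemon layer: process the oldest queued request; False if none is queued."""
    row = db.execute(
        "SELECT token, text FROM requests"
        " WHERE status = 'queued' ORDER BY rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    token, text = row
    db.execute(
        "UPDATE requests SET status = 'done', result = ? WHERE token = ?",
        (placeholder_tagger(text), token),
    )
    db.commit()
    return True


def fetch(db: sqlite3.Connection, token: str) -> tuple:
    """Interface layer: return (status, result); result is None until done."""
    row = db.execute(
        "SELECT status, result FROM requests WHERE token = ?", (token,)
    ).fetchone()
    if row is None:
        raise KeyError(token)
    return row
```

In this sketch a client would call submit, keep the returned token, and poll fetch until the status changes, while one or more daemons repeatedly call daemon_step. Because the Interface and the Daemons communicate only through the table, daemons could in principle be added without changing the client-facing calls, which is one way to read the scalability claim in the abstract.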