Summary of the paper

Title OAL: A NLP Architecture to Improve the Development of Linguistic Resources for NLP
Authors Javier Couto, Helena Blancafort, Somara Seng, Nicolas Kuchmann-Beauger, Anass Talby and Claude de Loupy
Abstract The performance of most NLP applications relies upon the quality of linguistic resources. The creation, maintenance and enrichment of those resources are a labour-intensive task, especially when no tools are available. In this paper we present the NLP architecture OAL, designed to assist computational linguists in the whole process of the development of resources in an industrial context: from corpora compilation to quality assurance. To add new words more easily to the morphosyntactic lexica, a guesser that lemmatizes and assigns morphosyntactic tags as well as inflection paradigms to a new word has been developed. Moreover, different control mechanisms are set up to check the coherence and consistency of the resources. Today OAL manages resources in five European languages: French, English, Spanish, Italian and Polish. Chinese and Portuguese are in process. The development of OAL has followed an incremental strategy. At present, semantic lexica, a named entities guesser and a named entities phonetizer are being developed.
Language LR Infrastructures and Architectures
Topics Lexicon, lexical database, Tools, systems, applications, LR Infrastructures and Architectures
Full paper OAL: A NLP Architecture to Improve the Development of Linguistic Resources for NLP
Bibtex @InProceedings{COUTO10.882,
  author = {Javier Couto and Helena Blancafort and Somara Seng and Nicolas Kuchmann-Beauger and Anass Talby and Claude de Loupy},
  title = {OAL: A NLP Architecture to Improve the Development of Linguistic Resources for NLP},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }