Summary of the paper

Title Creation of Lexical Resources for a Characterisation of Multiword Expressions in Italian
Authors Andrea Zaninello and Malvina Nissim
Abstract The theoretical characterisation of multiword expressions (MWEs) is tightly connected to their actual occurrences in data and to their representation in lexical resources. We present three lexical resources for Italian MWEs, namely an electronic lexicon, a series of example corpora and a database of MWEs represented around morphosyntactic patterns. These resources are matched against, and created from, a very large web-derived corpus for Italian that spans across registers and domains. We can thus test expressions coded by lexicographers in a dictionary, thereby discarding unattested expressions, revisiting lexicographers' choices on the basis of frequency information, and at the same time creating an example sub-corpus for each entry. We organise MWEs on the basis of the morphosyntactic information obtained from the data in an electronic, flexible knowledge-base containing structured annotation exploitable for multiple purposes. We also suggest further work directions towards characterising MWEs by analysing the data organised in our database through lexico-semantic information available in WordNet or MultiWordNet-like resources, also in the perspective of expanding their set through the extraction of other similar compact expressions.
Language Validation of LRs
Topics MultiWord Expressions & Collocations, Lexicon, lexical database, Validation of LRs
Full paper Creation of Lexical Resources for a Characterisation of Multiword Expressions in Italian
Bibtex @InProceedings{ZANINELLO10.567,
  author = {Andrea Zaninello and Malvina Nissim},
  title = {Creation of Lexical Resources for a Characterisation of Multiword Expressions in Italian},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }