Title |
Learning to Mine Definitions from Slovene Structured and Unstructured Knowledge-Rich Resources |
Authors |
Darja Fišer, Senja Pollak and Špela Vintar |
Abstract |
The paper presents an innovative approach to extract Slovene definition candidates from domain-specific corpora using morphosyntactic patterns, automatic terminology recognition and semantic tagging with wordnet senses. First, a classification model was trained on examples from Slovene Wikipedia which was then used to find well-formed definitions among the extracted candidates. The results of the experiment are encouraging, with accuracy ranging from 67% to 71%. The paper also addresses some drawbacks of the approach and suggests ways to overcome them in future work. |
Language |
Text mining |
Topics |
Knowledge Discovery/Representation, Lexicon, lexical database, Text mining |
Full paper  |
Learning to Mine Definitions from Slovene Structured and Unstructured Knowledge-Rich Resources |
Bibtex |
@InProceedings{FIER10.141,
  author = {Darja Fišer and Senja Pollak and Špela Vintar},
  title = {Learning to Mine Definitions from Slovene Structured and Unstructured Knowledge-Rich Resources},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |