LREC 2010 Proceedings

Summary of the paper

Title	IndoWordNet
Authors	Pushpak Bhattacharyya
Abstract	India is a multilingual country where machine translation and cross-lingual search are highly relevant problems. These problems require large resources, such as wordnets and lexicons, of high quality and coverage. Wordnets are lexical structures composed of synsets and semantic relations. Synsets are sets of synonyms. They are linked by semantic relations like hypernymy (is-a), meronymy (part-of), troponymy (manner-of), etc. IndoWordnet is a linked structure of wordnets of major Indian languages from the Indo-Aryan, Dravidian and Sino-Tibetan families. These wordnets have been created by following the expansion approach from the Hindi wordnet, which was made available free for research in 2006. Since then, a number of Indian languages have been creating their wordnets. In this paper we discuss the methodology, coverage, important considerations and multifarious benefits of IndoWordnet. Case studies are provided for Marathi, Sanskrit, Bodo and Telugu to bring out the basic methodology of, and the challenges involved in, the expansion approach. The guidelines the lexicographers follow for wordnet construction are enumerated. The difference between IndoWordnet and EuroWordnet is also discussed.
Language	Multilinguality
Topics	Lexicon, lexical database, LR Infrastructures and Architectures, Multilinguality
Full paper	IndoWordNet
Bibtex	@InProceedings{BHATTACHARYYA10.939, author = {Pushpak Bhattacharyya}, title = {IndoWordNet}, booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} }