LREC 2010 Proceedings

Summary of the paper

Title	Inferring Subcat Frames of Verbs in Urdu
Authors	Ghulam Raza
Abstract	This paper describes an approach for inferring syntactic frames of verbs in Urdu from an untagged corpus. Urdu, like many other South Asian languages, is a free word order and case-rich language. Separable lexical units mark different constituents for case in phrases and clauses and are called case clitics. There is not always a one-to-one correspondence between case clitic form and case, or between case and grammatical function, in Urdu. Case clitics, therefore, cannot serve as direct clues for extracting the syntactic frames of verbs. So a two-step approach has been implemented. In the first step, all case clitic combinations for a verb are extracted and the unreliable ones are filtered out by applying inferential statistics. In the second step, information about the occurrences of case clitic forms, both in the different combinations as a whole and at the individual level, is processed to infer all possible syntactic frames of the verb.
Language	Statistical and machine learning methods
Topics	Acquisition, Lexicon, lexical database, Statistical and machine learning methods
Full paper	Inferring Subcat Frames of Verbs in Urdu
Bibtex	@InProceedings{RAZA10.536, author = {Ghulam Raza}, title = {Inferring Subcat Frames of Verbs in Urdu}, booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} }
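
The first step described in the abstract (filtering unreliable case clitic combinations with inferential statistics) can be illustrated with a minimal sketch. The abstract does not say which statistical test the paper uses. The sketch below assumes a one-sided binomial test, a technique often used in subcategorization frame acquisition, and it should not be read as the author's method. The function names, the `noise_rate` and `alpha` parameters, and the toy observations are all hypothetical.

```python
from collections import Counter
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def filter_combinations(observations, noise_rate=0.05, alpha=0.05):
    """Keep clitic combinations that occur too often to be explained by noise.

    observations: one tuple of case clitic forms per occurrence of the verb.
    noise_rate:   assumed probability that a combination shows up by error
                  (hypothetical value, not taken from the paper).
    alpha:        significance level for rejecting the noise hypothesis.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return {
        combo: count
        for combo, count in counts.items()
        if binomial_tail(count, total, noise_rate) < alpha
    }


# Toy, made-up observations for a single verb (not corpus data).
observations = [("ne", "ko")] * 60 + [("ne",)] * 38 + [("ko", "se")] * 2
reliable = filter_combinations(observations)
print(sorted(reliable))
```

With these toy counts the rare combination `("ko", "se")` is discarded, because two occurrences in a hundred are well within what the assumed noise rate would produce, while the two frequent combinations are kept. The second step of the approach, mapping the surviving combinations to syntactic frames, depends on details given only in the full paper and is not sketched here.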