Summary of the paper

Title Error Correction for Arabic Dictionary Lookup
Authors C. Anton Rytting, Paul Rodrigues, Tim Buckwalter, David Zajic, Bridget Hirsch, Jeff Carnes, Nathanael Lynn, Sarah Wayland, Chris Taylor, Jason White, Charles Blake III, Evelyn Browne, Corey Miller and Tristan Purvis
Abstract We describe a new Arabic spelling correction system which is intended for use with electronic dictionary search by learners of Arabic. Unlike other spelling correction systems, this system does not depend on a corpus of attested student errors but on student- and teacher-generated ratings of confusable pairs of phonemes or letters. Separate error modules for keyboard mistypings, phonetic confusions, and dialectal confusions are combined to create a weighted finite-state transducer that calculates the likelihood that an input string could correspond to each citation form in a dictionary of Iraqi Arabic. Results are ranked by the estimated likelihood that a citation form could be misheard, mistyped, or mistranscribed for the input given by the user. To evaluate the system, we developed a noisy-channel model trained on students’ speech errors and use it to perturb citation forms from a dictionary. We compare our system to a baseline based on Levenshtein distance and find that, when evaluated on single-error queries, our system performs 28% better than the baseline (overall MRR) and is twice as good at returning the correct dictionary form as the top-ranked result. We believe this to be the first spelling correction system designed for a spoken, colloquial dialect of Arabic.
Language Authoring tools, proofing
Topics Lexicon, lexical database, Information Extraction, Information Retrieval, Authoring tools, proofing
Full paper Error Correction for Arabic Dictionary Lookup
Bibtex @InProceedings{RYTTING10.440,
  author = {C. Anton Rytting and Paul Rodrigues and Tim Buckwalter and David Zajic and Bridget Hirsch and Jeff Carnes and Nathanael Lynn and Sarah Wayland and Chris Taylor and Jason White and Charles Blake III and Evelyn Browne and Corey Miller and Tristan Purvis},
  title = {Error Correction for Arabic Dictionary Lookup},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }
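The abstract compares the system against a Levenshtein-distance baseline and reports mean reciprocal rank (MRR). The following is a minimal Python sketch of how such a baseline and metric could be computed. It is not the authors' implementation: the function names, the alphabetical tie-breaking rule, and the transliterated toy lexicon are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Unweighted edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_candidates(query: str, lexicon: Sequence[str]) -> List[str]:
    """Order citation forms by distance to the query.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    return sorted(lexicon, key=lambda w: (levenshtein(query, w), w))


def mean_reciprocal_rank(pairs: Sequence[Tuple[str, str]],
                         lexicon: Sequence[str]) -> float:
    """Average of 1/rank of the intended form over (query, target) pairs.

    A target missing from the lexicon contributes 0.
    """
    total = 0.0
    for query, target in pairs:
        ranked = rank_candidates(query, lexicon)
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(pairs)


# Hypothetical toy lexicon in Latin transliteration, for illustration only.
toy_lexicon = ["kitab", "katib", "maktab"]
toy_queries = [("kitaab", "kitab")]  # (learner input, intended citation form)
print(rank_candidates("kitaab", toy_lexicon)[0])
print(mean_reciprocal_rank(toy_queries, toy_lexicon))
```

A system like the one described would replace the uniform edit costs with weights derived from the confusability ratings, so that likely confusions cost less than unlikely ones. The sketch above covers only the unweighted baseline side of the comparison.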