Title |
A Person-Name Filter for Automatic Compilation of Bilingual Person-Name Lexicons |
Authors |
Satoshi Sato and Sayoko Kaide |
Abstract |
This paper proposes a simple and fast person-name filter, which plays an important role in automatic compilation of a large bilingual person-name lexicon. This filter is based on pn_score, which is the sum of two component scores, the score of the first name and that of the last name. Each score is calculated from two term sets: one is a dense set in which most of the members are person names; the other is a baseline set that contains fewer person names. The pn_score takes one of five values, {+2, +1, 0, -1, -2}, which correspond to strong positive, positive, undecidable, negative, and strong negative, respectively. This pn_score can be easily extended to a bilingual pn_score that takes one of nine values, by summing the scores of the two languages. Experimental results show that our method works well for monolingual person names in English and Japanese; the F-scores for the two languages are 0.929 and 0.939, respectively. The performance of the bilingual person-name filter is better; the F-score is 0.955. |
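The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how each component score is derived from the dense and baseline sets, so the relative-frequency comparison in `component_score`, the `ratio` threshold, the use of a single pair of term sets for both first and last names, and all function names and toy counts are assumptions made for illustration only, not the authors' method. Only the summation structure (two components giving five values, two languages giving nine) is taken from the abstract.

```python
from collections import Counter


def component_score(term, dense, baseline, ratio=2.0):
    """Score one name component as +1, 0, or -1.

    ASSUMPTION: the comparison rule and the `ratio` threshold are hypothetical;
    the abstract only says the score is computed from a dense and a baseline set.
    """
    d = dense[term] / max(sum(dense.values()), 1)        # relative frequency in dense set
    b = baseline[term] / max(sum(baseline.values()), 1)  # relative frequency in baseline set
    if d == 0 and b == 0:
        return 0       # unseen term: undecidable
    if d > ratio * b:
        return 1       # clearly more typical of the dense (person-name) set
    if b > ratio * d:
        return -1      # clearly more typical of the baseline set
    return 0


def pn_score(first, last, dense, baseline):
    """Sum of the first-name and last-name scores: a value in {-2, ..., +2}."""
    return (component_score(first, dense, baseline)
            + component_score(last, dense, baseline))


def bilingual_pn_score(score_a, score_b):
    """Sum of two monolingual pn_scores: a value in {-4, ..., +4} (nine values)."""
    return score_a + score_b


# Toy counts, invented for this example.
dense = Counter({"john": 5, "smith": 4, "bank": 1})
baseline = Counter({"john": 1, "smith": 1, "bank": 8, "river": 10})

strong_positive = pn_score("john", "smith", dense, baseline)
strong_negative = pn_score("river", "bank", dense, baseline)
positive = pn_score("john", "xyz", dense, baseline)
```

With these toy counts, "john smith" receives the strongest positive score, "river bank" the strongest negative one, and a candidate with one unseen component falls in between. A filter built on this would accept or reject candidates by comparing the score with a cutoff; the cutoff itself is not given in the abstract.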
Language |
Named Entity recognition |
Topics |
Tools, systems, applications, Lexicon, lexical database, Named Entity recognition |
Full paper  |
A Person-Name Filter for Automatic Compilation of Bilingual Person-Name Lexicons |
Bibtex |
@InProceedings{SATO10.343,
  author = {Satoshi Sato and Sayoko Kaide},
  title = {A Person-Name Filter for Automatic Compilation of Bilingual Person-Name Lexicons},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odjik, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |