Title |
Determining the Origin and Structure of Person Names |
Authors |
Yu Fu, Feiyu Xu and Hans Uszkoreit |
Abstract |
This paper presents a novel system HENNA (Hybrid Person Name Analyzer) for identifying language origin and analyzing linguistic structures of person names. We conduct ME-based classification methods for the language origin identification and achieve very promising performance. We will show that word-internal character sequences provide surprisingly strong evidence for predicting the language origin of person names. Our approach is context-, language- and domain-independent and can thus be easily adapted to person names in or from other languages. Furthermore, we provide a novel strategy to handle origin ambiguities or multiple origins in a name. HENNA also provides a person name parser for the analysis of linguistic and knowledge structures of person names. All the knowledge about a person name in HENNA is modelled in a person-name ontology, including relationships between language origins, linguistic features and grammars of person names of a specific language and interpretation of name elements. The approaches presented here are useful extensions of the named entity recognition task. |
Language |
Information Extraction, Information Retrieval |
Topics |
Language Identification, Named Entity recognition, Information Extraction, Information Retrieval |
Full paper  |
Determining the Origin and Structure of Person Names |
Bibtex |
@InProceedings{FU10.763,
  author = {Yu Fu and Feiyu Xu and Hans Uszkoreit},
  title = {Determining the Origin and Structure of Person Names},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |