Conference Publication Details
Mandatory Fields
Sandipan Dandapat and Andy Way
CICLING 2016: 17th International Conference on Intelligent Text Processing and Computational Linguistics
Improved Named Entity Recognition using Machine Translation-based Cross-lingual Information.
2016
April
Published
1
Optional Fields
Konya, Turkey
03-APR-16
09-APR-16
In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an online machine translation system and a separate word alignment phase to find the projection of each Hindi word in the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the alignment information, and use these features in a support vector machine-based classifier. The cross-lingual features improve the F1 score by 2.1 points absolute (2.9% relative) over a strong baseline model.
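The projection step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the transliterated example sentence, the tag set, and the alignment pairs are all hypothetical, and the sketch assumes the translation, word alignment, and English named entity tags have already been produced by external tools.

```python
from typing import Dict, List, Tuple


def project_ne_tags(
    source_tokens: List[str],
    english_tags: List[str],
    alignment: List[Tuple[int, int]],
    default_tag: str = "O",
) -> List[str]:
    """Copy English NE tags onto source-language tokens via word alignment.

    `alignment` holds (source_index, english_index) pairs. Source tokens
    with no alignment link keep `default_tag`.
    """
    projected = [default_tag] * len(source_tokens)
    for src_idx, en_idx in alignment:
        in_range = (
            0 <= src_idx < len(source_tokens)
            and 0 <= en_idx < len(english_tags)
        )
        # Assumption: when a source token has several links, the first
        # non-default tag wins.
        if in_range and projected[src_idx] == default_tag:
            projected[src_idx] = english_tags[en_idx]
    return projected


def token_features(token: str, projected_tag: str) -> Dict[str, str]:
    """Build a feature dict for one token (hypothetical feature names)."""
    return {"word": token, "xling_tag": projected_tag}


# Hypothetical example: transliterated Hindi tokens and their English translation.
hindi = ["Ram", "Dilli", "mein", "rahta", "hai"]
english_tags = ["PER", "O", "O", "LOC"]  # tags for: Ram lives in Delhi
links = [(0, 0), (1, 3), (2, 2), (3, 1), (4, 1)]

projected = project_ne_tags(hindi, english_tags, links)
features = [token_features(tok, tag) for tok, tag in zip(hindi, projected)]
```

Each feature dict could then be vectorized and passed to a support vector machine classifier alongside the monolingual features. How the paper encodes the projected information as features is not stated in this record.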
http://www.computing.dcu.ie/~away/PUBS/2016/cicling2016.pdf
Grant Details
Science Foundation Ireland (SFI)
13/RC/2106