Peer-Reviewed Journal Details
Mandatory Fields
Du J.;Way A.
2011
December
ACM Transactions on Asian Language Information Processing
Improved Chinese-English SMT with Chinese "DE" construction classification and reordering
Published
()
Optional Fields
Chinese DE construction; Dynamic probabilistic latent variable model; Log-linear model; Source-side reordering
10
4
Syntactic reordering on the source side has been shown to be effective for handling word order differences between source and target languages in SMT. In this article, we focus on the Chinese "DE" construction, which is flexible and ubiquitous in Chinese and can be translated into English in many different ways, making it a major source of word order differences that affect translation quality. We study the Chinese "DE" construction for Chinese-English SMT and propose a new classifier model, a discriminative latent variable model (DPLVM), with new features to improve classification accuracy and, indirectly, translation quality compared to a log-linear classifier. The DE classifier is used to recognize DE structures in both the training and test sentences of Chinese, and word reordering is then performed so that the Chinese sentences better match the word order of English. To investigate the impact of source-side DE classification and reordering on different types of SMT systems (namely PB-SMT, hierarchical PB-SMT (HPB-SMT), and syntax-based SMT (SAMT)), we conduct a series of experiments on the NIST 2005 and 2008 test sets to verify the effectiveness of our proposed model. The experimental results show that the MT systems using the data reordered by our proposed model outperform the baseline systems in terms of BLEU score by 3.01% and 4.03% relative on the NIST 2005 test set, and by 4.64% and 4.62% relative on the NIST 2008 test set, for PB-SMT and HPB-SMT respectively. However, the DE classification method does not yield significant improvements for SAMT. Additionally, we conducted experiments to evaluate the effect of our DE classification and reordering approach on the word alignment and phrase table for these three types of SMT systems. © 2011 ACM.
1530-0226
10.1145/2025384.2025385
Grant Details