Conference Publication Details
Mandatory Fields
Srivastava A.;Ma Y.;Way A.
Proceedings of the 15th International Conference of the European Association for Machine Translation, EAMT 2011
Oracle-based training for phrase-based Statistical Machine Translation
2011
December
Published
1
Optional Fields
169
176
A Statistical Machine Translation (SMT) system generates an n-best list of candidate translations for each sentence. A model error occurs if the most probable translation (1-best) generated by the SMT decoder is not the most accurate as measured by its similarity to the human reference translation(s) (an oracle). In this paper we investigate the parametric differences between the 1-best and the oracle translation and attempt to close this gap by proposing two rescoring strategies to push the oracle up the n-best list. We observe modest improvements in METEOR scores over the baseline SMT system trained on French-English Europarl corpora. We present a detailed analysis of the oracle rankings to determine the source of model errors, which in turn has the potential to improve overall system performance. © 2011 European Association for Machine Translation.
Grant Details