Mandatory Fields

Authors

Srivastava A.;Ma Y.;Way A.

Conference Title

Proceedings of the 15th International Conference of the European Association for Machine Translation, EAMT 2011

Title of Paper

Oracle-based training for phrase-based Statistical Machine Translation

Year

2011

Month

December

Status

Published

Peer Reviewed

Optional Fields

Start Page

169

End Page

176

Abstract

A Statistical Machine Translation (SMT) system generates an n-best list of candidate translations for each sentence. A model error occurs if the most probable translation (1-best) generated by the SMT decoder is not the most accurate as measured by its similarity to the human reference translation(s) (an oracle). In this paper we investigate the parametric differences between the 1-best and the oracle translation and attempt to close this gap by proposing two rescoring strategies to push the oracle up the n-best list. We observe modest improvements in METEOR scores over the baseline SMT system trained on French-English Europarl corpora. We present a detailed analysis of the oracle rankings to determine the source of model errors, which in turn has the potential to improve overall system performance. © 2011 European Association for Machine Translation.
