Mandatory Fields

Authors

Okita T.; Way A.

Conference Title

Proceedings of the 24th International Florida Artificial Intelligence Research Society Conference, FLAIRS-24

Title of Paper

Given bilingual terminology in statistical machine translation: MWE-sensitive word alignment and hierarchical Pitman-Yor process-based translation model smoothing

Year

2011

Month

September

Status

Published

Peer Reviewed

Times Cited

Optional Fields

Search Keyword

Editors

Start Page

269

End Page

274

Location

Start Date

End Date

Abstract

This paper considers a scenario in which we are given almost perfect knowledge of the bilingual terminology for a test corpus in Statistical Machine Translation (SMT). When the given terminology is part of the training corpus, one natural strategy in SMT is to use the trained translation model and ignore the given terminology. Two questions then arise. 1) Can a word aligner capture the given terminology? This matters because, even if the terminology appears in the training corpus, the resulting translation model often does not include it. 2) Are the probabilities in the translation model correctly calculated? To answer these questions, we conducted experiments introducing a Multi-Word Expression-sensitive (MWE-sensitive) word aligner and a hierarchical Pitman-Yor process-based translation model smoothing. Using a 200k JP-EN NTCIR corpus, our experimental results show that introducing the MWE-sensitive word aligner and the new translation model smoothing yields an overall improvement of 1.35 BLEU points absolute and 6.0% relative compared with the case where neither is introduced. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.

Funded By

URL

DOI Link

Grant Details

Funding Body

Grant Details