Mandatory Fields

Authors

Peyman Passban, Qun Liu, Andy Way

Conference Title

COLING 2016

Title of Paper

Enriching Phrase Tables for Statistical Machine Translation Using Mixed Embeddings

Year

2016

Month

December

Status

Published

Peer Reviewed

Times Cited

Optional Fields

Search Keyword

Editors

Start Page

2582

End Page

2591

Location

Osaka, Japan

Start Date

11-DEC-16

End Date

17-DEC-16

Abstract

The phrase table is considered to be the main bilingual resource for the phrase-based statistical machine translation (PBSMT) model. During translation, a source sentence is decomposed into several phrases. The best match of each source phrase is selected among several target-side counterparts within the phrase table, and processed by the decoder to generate a sentence-level translation. The best match is chosen according to several factors, including a set of bilingual features. PBSMT engines by default provide four probability scores in phrase tables which are considered as the main set of bilingual features. Our goal is to enrich that set of features, as a better feature set should yield better translations. We propose new scores generated by a Convolutional Neural Network (CNN) which indicate the semantic relatedness of phrase pairs. We evaluate our model in different experimental settings with different language pairs. We observe significant improvements when the proposed features are incorporated into the PBSMT pipeline.

Funded By

URL

http://www.aclweb.org/anthology/C16-1243

DOI Link

Grant Details

Funding Body

Science Foundation Ireland (SFI)

Grant Details

13/RC/2106