Mandatory Fields

Authors

Peyman Passban, Qun Liu, Andy Way

Year

2017

Month

June

Journal

Prague Bulletin of Mathematical Linguistics

Title

Providing Morphological Information for SMT Using Neural Networks

Status

Published

Optional Fields

Volume

108

Start Page

271

End Page

282

Abstract

Treating morphologically complex words (MCWs) as atomic units in translation would not yield a desirable result. Such words are complicated constituents with meaningful subunits. A complex word in a morphologically rich language (MRL) could be associated with a number of words or even a full sentence in a simpler language, which means the surface form of complex words should be accompanied by auxiliary morphological information in order to provide a precise translation and a better alignment. In this paper we follow this idea and propose two different methods to convey such information for statistical machine translation (SMT) models. In the first model we enrich factored SMT engines by introducing a new morphological factor which relies on subword-aware word embeddings. In the second model we focus on the language-modeling component. We explore a subword-level neural language model (NLM) to capture sequence-, word- and subword-level dependencies. Our NLM is able to approximate better scores for conditional word probabilities, so the decoder generates more fluent translations. We studied two languages, Farsi and German, in our experiments and observed significant improvements for both of them.

Publisher Location

Czech Republic

URL

https://www.degruyter.com/view/j/pralin.2017.108.issue-1/pralin-2017-0026/pralin-2017-0026.xml

DOI Link

10.1515/pralin-2017-0026
