Conference Publication Details
Mandatory Fields
Eva Vanmassenhove, Jinhua Du, Andy Way
ICLC8 - 8th International Contrastive Linguistics Conference
Phrase-Tables as a resource for Cross-Linguistic Studies: On the Role of Lexical Aspect for English-French Past Tense Translation
2017
May
Published
1
Optional Fields
Athens, Greece
25-MAY-17
28-MAY-17
The fields of Contrastive Linguistics (CL), Translation Studies (TS) and Machine Translation (MT) are closely related but rarely brought together in practice (Čulo, 2016). However, just as MT can benefit from contrastive translation studies, so too can MT offer insights into empirical linguistic investigations. In particular, Statistical Machine Translation (SMT: e.g. Koehn, 2003) phrase-tables can be exploited to derive generalized linguistic information. In this study, we examine the influence of lexical aspect on English simple past verbs translated into French past tenses (passé composé and imparfait). We compiled a list of 206 English verbs classified into three aspectual classes (stative, dynamic telic, and dynamic atelic) according to Wilmet’s taxonomy (1998). We trained two SMT systems with the Moses toolkit (Koehn et al., 2007) using the default settings: (1) one trained on 1 million parallel English-French sentences of the Europarl corpus (Koehn et al., 2005), and (2) one trained on the News Commentary corpus. The translation probabilities of the simple past verbs were automatically extracted from the phrase-tables and summed, yielding for every verb its probability of being translated into the passé composé or the imparfait. We observed that many verbs had a strong preference for one tense or the other in both corpora. However, the passé composé is more frequent in the Europarl corpus, which caused some (atelic activity) verbs to shift preference from imparfait to passé composé. While the aspectual class of the verb did not seem to have a strong influence on the use of the passé composé, a preference for the imparfait seemed to occur exclusively with verbs belonging to the stative or activity classes.
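The extraction step described in the abstract (summing phrase-table translation probabilities per verb and per French tense) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes the usual Moses text phrase-table layout (`source ||| target ||| scores ||| ...`) and assumes that the third score is the direct translation probability of the target given the source. The function names, the `score_index` parameter, and especially the tense heuristics (a present-tense auxiliary followed by a participle-like token for the passé composé, typical endings for the imparfait) are hypothetical simplifications introduced here; the abstract does not say how tenses were identified.

```python
from collections import defaultdict

# Hypothetical, deliberately crude heuristics (NOT the authors' method).
AUXILIARIES = {"ai", "as", "a", "avons", "avez", "ont",
               "suis", "es", "est", "sommes", "êtes", "sont"}
PARTICIPLE_ENDINGS = ("é", "ée", "és", "ées", "i", "is", "it", "u", "us")
IMPARFAIT_ENDINGS = ("ais", "ait", "ions", "iez", "aient")


def classify_tense(target):
    """Guess the tense of a French target phrase, or return None."""
    tokens = target.lower().split()
    if (len(tokens) == 2 and tokens[0] in AUXILIARIES
            and tokens[1].endswith(PARTICIPLE_ENDINGS)):
        return "passé composé"
    if len(tokens) == 1 and tokens[0].endswith(IMPARFAIT_ENDINGS):
        return "imparfait"
    return None


def tense_probabilities(lines, verbs, score_index=2):
    """Sum translation probabilities per (English verb, French tense).

    lines: phrase-table lines, assumed 'source ||| target ||| scores ||| ...'
    verbs: set of English simple past forms to look up
    score_index: position of the score to sum (assumed: direct probability)
    """
    totals = defaultdict(lambda: defaultdict(float))
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines
        source, target, scores = fields[0], fields[1], fields[2].split()
        if source not in verbs or len(scores) <= score_index:
            continue
        tense = classify_tense(target)
        if tense is not None:
            totals[source][tense] += float(scores[score_index])
    return {verb: dict(by_tense) for verb, by_tense in totals.items()}


# Made-up example lines, for illustration only (not data from the study).
sample = [
    "knew ||| savait ||| 0.5 0.4 0.6 0.5",
    "knew ||| connaissait ||| 0.4 0.3 0.2 0.2",
    "knew ||| a su ||| 0.3 0.2 0.1 0.1",
    "arrived ||| est arrivé ||| 0.6 0.5 0.7 0.6",
    "arrived ||| arrivait ||| 0.2 0.1 0.1 0.1",
    "arrived ||| arrive ||| 0.1 0.1 0.05 0.05",
]
result = tense_probabilities(sample, {"knew", "arrived"})
```

With the made-up lines above, "knew" accumulates most of its probability mass under the imparfait and "arrived" under the passé composé, which mirrors the kind of per-verb preference the abstract reports. A real replication would need a proper morphological analysis of the French side rather than suffix matching.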
Grant Details
Science Foundation Ireland (SFI)
13