Conference Publication Details
Mandatory Fields
Dowling, M., Lynn, T. and Way, A.
Social MT 2017 - First workshop on Social Media and User Generated Content Machine Translation
A Crowd-sourcing Approach for Translations of Minority Language User-Generated Content
2017
May
Published
1
Optional Fields
Prague, Czech Republic
29-MAY-17
31-MAY-17
Abstract
Data sparsity is a common problem for machine translation of minority and less-resourced languages. While data collection for standard, grammatical text can be challenging enough, efforts for collection of parallel user-generated content can be even more challenging. In this paper we describe an approach to collecting English↔Irish translations of user-generated content (tweets) that overcomes some of these hurdles. We show how a crowd-sourced data collection campaign, which was tailored to our target audience (the Irish language community), proved successful in gathering data for a niche domain. We also discuss the reliability of crowdsourcing English↔Irish tweet translations in terms of quality by reporting on a self-rating approach along with qualified reviewer ratings.
Grant Details