Mandatory Fields

Authors

Dowling, M., Lynn, T. and Way, A.

Conference Title

Social MT 2017 - First workshop on Social Media and User Generated Content Machine Translation

Title of Paper

A Crowd-sourcing Approach for Translations of Minority Language User-Generated Content

Year

2017

Month

May

Status

Published

Peer Reviewed

Times Cited

()

Optional Fields

Search Keyword

Editors

Start Page

End Page

Location

Prague, Czech Republic

Start Date

29-MAY-17

End Date

31-MAY-17

Abstract

Data sparsity is a common problem for machine translation of minority and less-resourced languages. While data collection for standard, grammatical text can be challenging enough, efforts for collection of parallel user-generated content can be even more challenging. In this paper we describe an approach to collecting English↔Irish translations of user-generated content (tweets) that overcomes some of these hurdles. We show how a crowd-sourced data collection campaign, which was tailored to our target audience (the Irish language community), proved successful in gathering data for a niche domain. We also discuss the reliability of crowdsourcing English↔Irish tweet translations in terms of quality by reporting on a self-rating approach along with qualified reviewer ratings.

Funded By

URL

DOI Link

Grant Details

Funding Body

Grant Details