Conference Publication Details
Mandatory Fields
Haithem Afli, Pintu Lohar and Andy Way
IJCNLP 2017 Workshop on Curation and Applications of Parallel and Comparable Corpora (Cupral 2017)
MultiNews: A Web collection of an Aligned Multimodal and Multilingual Corpus
2017
November
Published
1
Optional Fields
11
15
Taipei, Taiwan
27-NOV-17
01-DEC-17
Integrating Natural Language Processing (NLP) and computer vision is a promising effort. However, the applicability of these methods depends directly on the availability of specific multimodal data that includes both images and text. In this paper, we present a multimodal corpus of comparable documents and their images in 9 languages, collected from the web news articles of the Euronews website. This corpus has found widespread use in the NLP community in multilingual and multimodal tasks. Here, we focus on the acquisition of the image and text data and on their multilingual alignment.
http://aclweb.org/anthology/W17-5602
Grant Details
Science Foundation Ireland (SFI)
13/RC/2106