Measuring comprehension and perception of neural machine translated texts: a pilot study

Publication type
C1
Publication status
Published
Authors
Macken, L., & Ghyselen, I.
Series
Translating and the Computer 40 (TC40): proceedings
Pagination
120-126
Publisher
Editions Tradulex (Geneva)
Conference
Translating and the Computer 40 (London)
Project
ArisToCAT

Abstract

In this paper we compare the results of reading comprehension tests on both human-translated and raw (unedited) machine-translated texts. We selected three texts from the English Machine Translation Evaluation version (CREG-MT-eval) of the Corpus of Reading Comprehension Exercises (CREG), for which we produced three different translations: a manual translation and two automatic translations generated by two state-of-the-art neural machine translation engines, viz. DeepL and Google Translate. The experiment was conducted via a SurveyMonkey questionnaire, which 99 participants completed. Participants were asked to read the translation very carefully, after which they had to answer the comprehension questions without access to the translated text. Apart from assessing comprehension, we posed additional questions to gauge the participants' perception of the machine translations. The results show that 74% of the participants could tell whether a translation was produced by a human or a machine. Human translations received the best overall clarity scores, but the reading comprehension tests yielded much less clear-cut results. The errors that bothered readers most related to grammar, sentence length, level of idiomaticity, and incoherence.