MultiLing EN-NL

Lieve Macken and Bram Vanroy

About MultiLing EN-NL

ENDU20: Ten Dutch translations of the multiLing set. Due to COVID restrictions, only keyboard logging data was collected, using Translog-II. Participants were not allowed to use external resources such as dictionaries or Wikipedia. The participants were native Dutch speakers who had recently obtained a master’s degree in translation that included English-to-Dutch translation. Tokenization of the target text was manually corrected, but the source text was kept as-is for comparability. The sentence and word alignments were manually corrected. Data collection started as part of Lise Verstraete's master's thesis "Estimating translation difficulty based on readability scores, subjectivity evaluation and process data" and continued as part of the PhD project of Bram Vanroy (Vanroy, 2021).

ENDU20-MT: Two Dutch machine translations of the multiLing set, produced by DeepL (P20) and Google Translate (P21). These translations were made in 2020. Tokenization of the target text was manually corrected, but the source text was kept as-is for comparability. The sentence and word alignments were manually corrected. The translations were NOT checked for errors, so this dataset should be used with caution. The data was collected during the PhD project of Bram Vanroy (Vanroy, 2021).

Download instructions

  1. Go to the TPR-DB website 
  2. Log in as user TPRDB with password tprdb
  3. After logging in, switch to the management view 
  4. Find the relevant study and download the corresponding data