Posted on April 29, 2019
Two new journal articles by Arda Tezcan, Véronique Hoste and Lieve Macken have been published.
Estimating word-level quality of statistical machine translation output using monolingual information alone: In this paper we investigate the effectiveness of using monolingual information contained in the machine-translated text to estimate word-level quality of SMT output. Our results show that this method is effective for capturing all types of fluency errors at once. On the task of predicting post-editing effort, while relying solely on monolingual information, it achieves results on par with state-of-the-art quality estimation systems that use both bilingual and monolingual information.
Estimating post-editing time using a gold-standard set of machine translation errors: Can we use machine translation errors to predict post-editing effort accurately? Which error types are the best predictors of post-editing effort? In this study, we seek answers to these questions by using machine learning and feature selection techniques.