Recent educational policies stipulate that Flemish education will implement large-scale, government-led standardized testing, including tests of Dutch reading comprehension and writing. This study focuses on the potential of Automated Essay Scoring (AES) to assess a large number of writing products consistently, fairly and practically. AES systems score texts automatically by extracting linguistic characteristics and applying machine learning (Allen et al., 2016).
While research has shown that AES can be used to assess writing, previous work has focused almost exclusively on a single language (English) and genre (essays), mostly written for higher-education purposes (Strobl et al., 2019). This exploratory study investigates whether AES can provide reliable scores for the writing of Dutch-speaking learners in the first stage of secondary education.
A corpus of 5,110 writing products by 2,613 pupils aged 13-14, based on six prompts, was scored holistically by 852 in-service and pre-service teachers using pairwise comparison. This assessed corpus was used to train machine learning models and thus to create a first AES system for assessing the Dutch writing products of learners in the first stage of secondary education. We experimented with two flavours of machine learning: a traditional feature-based approach and a deep learning approach. For the former, all writing products were processed with T-Scan (Pander Maat et al., 2014) to derive linguistic text characteristics, including lexical and syntactic measures. The latter relies on state-of-the-art pre-trained Dutch language models (Delobelle et al., 2020), fine-tuned on the task of AES.
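For illustration, holistic scores can be derived from pairwise comparisons with a Bradley-Terry model, a standard choice in comparative judgement. The sketch below is a minimal Python implementation using the MM algorithm; the study does not specify its exact scaling procedure, so this is an assumed, illustrative reconstruction.

```python
import numpy as np

def bradley_terry(n_items, comparisons, n_iters=200, eps=1e-6):
    """Estimate latent quality scores from pairwise comparisons via
    the MM algorithm for the Bradley-Terry model (Hunter, 2004).

    comparisons: iterable of (winner, loser) index pairs, one per
    judgement in which one text was preferred over another.
    """
    wins = np.zeros(n_items)                # wins per text
    played = np.zeros((n_items, n_items))   # comparison counts per pair
    for w, l in comparisons:
        wins[w] += 1
        played[w, l] += 1
        played[l, w] += 1

    p = np.ones(n_items)                    # initial ability estimates
    for _ in range(n_iters):
        # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
        denom = (played / (p[:, None] + p[None, :])).sum(axis=1)
        p = (wins + eps) / (denom + eps)    # eps keeps abilities positive
        p /= p.sum()                        # fix the arbitrary scale
    return np.log(p)                        # log-abilities as scores

# Toy usage: text 0 beats texts 1 and 2; text 1 beats text 2.
print(bradley_terry(3, [(0, 1), (0, 2), (1, 2)]))
```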
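A minimal sketch of the feature-based approach, assuming the T-Scan output has been exported to a table with one row per text and one numeric column per linguistic measure. The file name, column names and model choice (a random forest regressor) are hypothetical, not the study's actual setup.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical table: one row per text, T-Scan measures as columns,
# plus the holistic score derived from the comparative judgements.
df = pd.read_csv("tscan_features.csv")       # illustrative file name
X = df.drop(columns=["text_id", "score"])    # linguistic features
y = df["score"]                              # holistic target score

model = RandomForestRegressor(n_estimators=500, random_state=0)
# Five-fold cross-validated R^2 between predicted and human scores.
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```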
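A minimal sketch of the deep learning approach, fine-tuning RobBERT (the Dutch pre-trained model of Delobelle et al., 2020) as a regression model with the Hugging Face transformers library. The checkpoint id, toy data and hyperparameters are assumptions for illustration only.

```python
import numpy as np
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "pdelobelle/robbert-v2-dutch-base"   # assumed RobBERT checkpoint id
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL,
    num_labels=1,                # single output: a continuous score
    problem_type="regression",   # MSE loss instead of cross-entropy
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

# Toy data; in practice each example would be a pupil's text paired
# with its holistic score. Labels are float32 for the MSE loss.
ds = Dataset.from_dict({
    "text": ["Voorbeeldtekst een.", "Voorbeeldtekst twee."],
    "label": np.array([0.2, 0.8], dtype=np.float32),
}).map(tokenize, batched=True)

args = TrainingArguments(output_dir="aes-robbert",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
# Passing the tokenizer enables dynamic padding of each batch.
Trainer(model=model, args=args, train_dataset=ds,
        tokenizer=tokenizer).train()
```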
The predictions of the AES system will be discussed in relation to previous studies on the role of AES in assessing writing in education. We will also address implications for the use of AES in the context of language testing and zoom in on the challenges of working with data from young learners.