Exploring LLMs’ capabilities for error detection in Dutch L1 and L2 writing products

Publication type: A2
Publication status: Published
Authors: Kruijsbergen, J., Van Geertruyen, S., Hoste, V., & De Clercq, O.
Journal: Computational Linguistics in the Netherlands Journal
Volume: 13
Pagination: 173-191

Abstract

This research examines the capabilities of Large Language Models for writing error detection, which can be seen as a first step towards automated writing support. Our work focuses on Dutch writing error detection, targeting two envisaged end-users: L1 and L2 adult speakers of Dutch. We relied on proprietary L1 and L2 datasets comprising writing products annotated with a variety of writing errors. Following recent paradigms in NLP research, we experimented with both a fine-tuning approach combining different monolingual (BERTje, RobBERT) and multilingual (mBERT, XLM-RoBERTa) models and a zero-shot approach that prompts a generative autoregressive language model (GPT-3.5). The results reveal that the fine-tuning approach outperforms zero-shot prompting by a large margin for both L1 and L2, although much room for improvement remains.
