MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection

Publication type
C1
Publication status
Published
Authors
Volodina, E., Bryant, C., Caines, A., De Clercq, O., Frey, J., Ershova, E., Rosen, A., & Vinogradova, O.
Series
Proceedings of the 12th Workshop on NLP for Computer Assisted Language Learning
Volume
197
Pagination
1-16
Publisher
LiU Electronic Press
Conference
12th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2023) (Tórshavn, Faroe Islands)

Abstract

This paper reports on the NLP4CALL shared task on Multilingual Grammatical Error Detection (MultiGED-2023), which included five languages: Czech, English, German, Italian and Swedish. It is the first shared task organized by the Computational SLA working group, whose aim is to promote less represented languages in the fields of Grammatical Error Detection and Correction, and other related fields. The MultiGED datasets have been produced based on second language (L2) learner corpora for each particular language. In this paper we introduce the task as a whole, elaborate on the dataset generation process and the design choices made to obtain the MultiGED datasets, and provide details of the evaluation metrics and CodaLab setup. We further briefly describe the systems used by participants and report the results.