A Modular Approach to Learning Dutch Co-reference Resolution

Publication type: C1
Publication status: Published
Authors: Hoste, V., & van den Bosch, A.
Editor: C. Johansson
Series: Proceedings from the first Bergen Workshop on Anaphora Resolution (WAR I)
Pagination: 51-57
Publisher: Cambridge Scholars Publishing

Abstract

This paper presents the first machine learning approach to resolving co-referential relations between nominal constituents in Dutch. We hypothesize that different information sources contribute to the correct resolution of different types of co-referential links (pronominal, proper noun and common noun), and therefore propose a modular approach in which a separate module is trained per NP type. We thoroughly compare two machine learning techniques, a lazy learner and an eager learner, trained on the modular tasks as well as on the undecomposed task. In addition, we show that postprocessing the resulting co-reference chains with a string-edit distance correction mechanism avoids some unlikely local chainings and thereby improves precision. Since no comparative results exist for Dutch, we also report results on the English MUC-6 and MUC-7 data sets, which are widely used for evaluation.
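The string-edit distance correction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the mechanism, so everything below is an assumption made for illustration: the function names, the length normalization, the 0.5 threshold, and the restriction to links between name-like mentions (a surface-string check would make little sense for pronouns). It is not the authors' implementation.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Case-insensitive edit distance scaled to [0, 1] by the longer string."""
    longest = max(len(a), len(b))
    return edit_distance(a.lower(), b.lower()) / longest if longest else 0.0


def filter_links(links: List[Tuple[str, str]],
                 threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Drop proposed links whose mentions are too dissimilar on the surface.

    Hypothetical filter: the threshold and the idea of applying it to
    name-like mentions only are assumptions, not taken from the paper.
    """
    return [(a, b) for a, b in links if normalized_distance(a, b) <= threshold]


proposed = [("Clinton", "Clintons"), ("Clinton", "Brussel")]
kept = filter_links(proposed)
```

In this sketch a link is removed when the two mention strings differ in more than half of their characters, which trades some recall for precision, in line with the effect described in the abstract.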
