A Modular Approach to Learning Dutch Co-reference Resolution

Publication type
C1
Publication status
Published
Authors
Hoste, V., & van den Bosch, A.
Editor
C. Johansson
Series
Proceedings from the first Bergen Workshop on Anaphora Resolution (WAR I)
Pagination
51-57
Publisher
Cambridge Scholars Publishing

Abstract

This paper presents the first machine learning approach to the resolution of co-referential relations between nominal constituents in Dutch. Based on the hypothesis that different types of information sources contribute to the correct resolution of different types of co-referential links (pronominal, proper noun and common noun), we propose a modular approach in which a separate module is trained per NP type. We present a thorough comparison of two machine learning techniques, a lazy learner and an eager learner, trained on the modular tasks as well as on the undecomposed task. In addition, we show that postprocessing the resulting co-reference chains with a correction mechanism based on string-edit distance avoids some unlikely local chainings and thereby improves precision. Because no comparative results are available for Dutch, we also report results on the English MUC-6 and MUC-7 data sets, which are widely used for evaluation.
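
The abstract does not spell out how the string-edit distance correction works, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a proposed link between two non-pronominal mentions is kept only when their length-normalized Levenshtein distance stays at or below a cutoff. The function names (`edit_distance`, `normalized_distance`, `filter_links`) and the threshold value of 0.6 are hypothetical choices made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca by cb, or match
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Case-insensitive edit distance divided by the longer string's length."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def filter_links(links: list[tuple[str, str]],
                 threshold: float = 0.6) -> list[tuple[str, str]]:
    """Keep only (anaphor, antecedent) pairs that are close as strings.

    The threshold is a hypothetical value for this sketch, not one
    reported in the paper.
    """
    return [(anaphor, antecedent) for anaphor, antecedent in links
            if normalized_distance(anaphor, antecedent) <= threshold]


if __name__ == "__main__":
    proposed = [("Clinton", "President Clinton"), ("Clinton", "Brussels")]
    print(filter_links(proposed))
```

In this sketch a filter of this kind can only remove links, which matches the abstract's claim of improved precision. Pronominal links would have to be exempted, since a pronoun rarely resembles its antecedent as a string.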