A Modular Approach to Learning Dutch Co-reference Resolution
- Authors: Hoste, V., & van den Bosch, A.
- Editor: C. Johansson
- Published in: Proceedings from the first Bergen Workshop on Anaphora Resolution (WAR I)
- Publisher: Cambridge Scholars Publishing
This paper presents the first machine learning approach to the resolution of co-referential relations between nominal constituents in Dutch. Based on the hypothesis that different types of information sources contribute to a correct resolution of different types (pronominal, proper noun and common noun) of co-referential links, we propose a modular approach in which a separate module is trained per NP type. We present a thorough comparison of two machine learning techniques, a lazy learner and an eager learning approach, trained on the modular tasks as well as on the undecomposed task. In addition, we show that by postprocessing the resulting co-reference chains by means of a string-edit distance correction mechanism, we can avoid some unlikely local chainings and thereby improve precision. Lacking comparative results for Dutch, we also report results on the English MUC-6 and MUC-7 data sets, which are widely used for evaluation.
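The abstract does not say how the string-edit distance correction is applied, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a chain is a list of non-pronominal mention strings and that a mention is dropped when it is not close to any mention already kept. The function names, the length normalization, and the `max_dist` threshold of 0.5 are all hypothetical choices made for illustration.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def prune_chain(chain: List[str], max_dist: float = 0.5) -> List[str]:
    """Keep the first mention, then keep each later mention only if it is
    within max_dist of some mention already kept (assumed heuristic)."""
    if not chain:
        return []
    kept = [chain[0]]
    for mention in chain[1:]:
        if any(normalized_distance(mention.lower(), k.lower()) <= max_dist
               for k in kept):
            kept.append(mention)
    return kept


if __name__ == "__main__":
    print(prune_chain(["Philips", "Philips NV", "Unilever"]))
```

Under these assumptions, a mention such as "Unilever" would be removed from a chain that otherwise contains "Philips" and "Philips NV", which trades some recall for precision. Pronouns are left out of the sketch because string similarity carries no information about their antecedents.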