Investigating cross-document event coreference for Dutch

Publication type
C1
Publication status
Published
Authors
De Langhe, L., De Clercq, O., & Hoste, V.
Series
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2022)
Pagination
88-98
Publisher
Association for Computational Linguistics (Gyeongju, Republic of Korea)
Conference
Fifth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2022) (Gyeongju, Republic of Korea)

Abstract

In this paper we present baseline results for Event Coreference Resolution (ECR) in Dutch using gold-standard (i.e. non-predicted) event mentions. A newly developed benchmark dataset allows us to properly investigate the possibility of creating ECR systems for both within-document and cross-document coreference. We give an overview of the state of the art for ECR in other languages, as well as a detailed overview of existing ECR resources. Afterwards, we provide a comparative report on our own dataset. We apply a range of approaches that have been shown to attain good results for English ECR, including feature-based models, monolingual transformer language models and multilingual language models. The best results were obtained using the monolingual BERTje model. Finally, results for all models are thoroughly analysed and visualised, so as to provide insight into the inner workings of ECR and long-distance semantic NLP tasks in general.