ENCORE corpus

Developers
Orphée De Clercq, Loic De Langhe and Véronique Hoste

About ENCORE corpus

On this page you can request access to the ENCORE corpus: a collection of 1,115 Dutch news texts in which coreference between news events is annotated, both within and across documents. The articles in the corpus cover a wide variety of topics, ranging from geopolitical events to local news. All texts have been manually annotated following these guidelines.

The corpus can be used to reproduce the results reported in De Langhe, L., De Clercq, O., & Hoste, V. (2023). Constructing a cross-document event coreference corpus for Dutch. Language Resources and Evaluation, 57(2), 819–848. https://doi.org/10.1007/s10579-022-09597-1

You can access the datasets by filling in your credentials at the top of this page. Please note that by downloading the data you agree to the following terms and conditions:

The authors and their affiliated institutions make no warranties regarding the datasets provided. They cannot be held liable for providing access to the datasets or for any use of the datasets.
The datasets may only be used for scientific or research purposes. Any other use is explicitly prohibited.
The datasets must not be redistributed or shared, in part or in full, with any third party. Please refer interested parties to this page instead.
If you use any of the datasets, you agree to cite the associated paper.