An emotional mess! Deciding on a framework for building a Dutch emotion-annotated corpus

Publication type
P1
Publication status
Published
Authors
De Bruyne, L., De Clercq, O., & Hoste, V.
Series
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020)
Pagination
1636-1644
Publisher
European Language Resources Association (ELRA)
Conference
12th International Conference on Language Resources and Evaluation (LREC) (Marseille, France)

Abstract

Given the myriad of existing emotion models, with the categorical versus dimensional opposition as the most important dividing line, building an emotion-annotated corpus requires well-thought-out strategies concerning framework choice. In our work on automatic emotion detection in Dutch texts, we investigate this problem by means of two case studies. We find that the labels joy, love, anger, sadness and fear are well-suited to annotate texts coming from various domains and topics, but that the connotation of the labels strongly depends on the origin of the texts. Moreover, it seems that information is lost when an emotional state is forcibly classified into a limited set of categories, indicating that a bi-representational format is desirable when creating an emotion corpus.