Prospects for Dutch emotion detection: insights from the new EmotioNL dataset

Publication type
A2
Publication status
Published
Authors
De Bruyne, L., De Clercq, O., & Hoste, V.
Journal
COMPUTATIONAL LINGUISTICS IN THE NETHERLANDS JOURNAL
Volume
11
Pagination
231-255

Abstract

Although emotion detection has become a crucial research direction in NLP, the main focus is on English resources and data. The main obstacles to more specialized emotion detection are the lack of annotated data in smaller languages and the limited emotion taxonomy. As a first step towards improving emotion detection for Dutch, we present EmotioNL, an emotion dataset consisting of 1,000 Dutch tweets and 1,000 captions from TV shows, annotated with emotion categories (anger, fear, joy, love, sadness and neutral) and dimensions (valence, arousal and dominance). We evaluate the state-of-the-art Dutch transformer models BERTje and RobBERT on this new dataset, investigate model generalizability across domains, and perform a thorough error analysis based on the Component Process Model of emotions.