UniC : a dataset for emotion analysis of videos with multimodal and unimodal labels

Publication type
A1
Publication status
In press
Authors
Du, Q., Labat, S., Demeester, T., & Hoste, V.
Journal
LANGUAGE RESOURCES AND EVALUATION
Pagination
1-36

Abstract

Emotion is a key characteristic that differentiates humans from machines. It is intricate, encompassing a wide variety of emotional states, and is expressed through both verbal and non-verbal communication channels. Different modalities contribute in unique ways to the integrated expression of emotion. However, most existing multimodal datasets provide only a single unified emotion label across the various modalities, ignoring the heterogeneity and complementarity of the different modalities. To bridge this gap, we introduce UniC, a novel multimodal emotion dataset featuring both integrated multimodal labels and independent unimodal labels. UniC comprises 965 emotion-rich video clips selected from YouTube, annotated in text, audio, silent video, and multimodal setups with both categorical and dimensional labels. We outline the steps taken to construct the dataset and analyze the different modality perspectives in UniC. Our findings indicate that while the text modality in most cases bears the closest emotional resemblance to the multimodal setup, other modalities can exhibit different, sometimes even opposite, emotions that may contribute more to the overall emotional state. This dataset offers a modality-specific perspective on multimodal emotion analysis and has the potential to provide valuable insights for further research in human emotion understanding.