Sentiment Analysis on Video Transcripts: Comparing the Value of Textual and Multimodal Annotations

Publication type
U
Publication status
Published
Authors
Du, Q., De Langhe, L., Lefever, E., & Hoste, V.
Editor
JinYeong Bak, Rob van der Goot, Hyeju Jang, Weerayut Buaphet, Alan Ramponi, Wei Xu, and Alan Ritter
Series
Proceedings of the Tenth Workshop on Noisy and User-generated Text
Pagination
10-15
Publisher
Association for Computational Linguistics (Albuquerque, New Mexico, USA)
Conference
The Tenth Workshop on Noisy and User-generated Text (Albuquerque, New Mexico, USA)

Abstract

This study explores the differences between textual and multimodal sentiment annotations of videos and their impact on transcript-based sentiment modelling. Using the UniC and CH-SIMS datasets, which are annotated at both the unimodal and multimodal levels, we conducted a statistical analysis and sentiment modelling experiments. The results reveal significant differences between the two annotation types, with textual annotations yielding better performance in sentiment modelling and demonstrating superior generalization ability. These findings highlight the challenges of cross-modality generalization and provide insights for advancing sentiment analysis.
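
As a rough illustration of the comparison the abstract describes, the sketch below trains one transcript-based sentiment classifier on text-only labels and one on multimodal labels and compares their scores. It is not the authors' pipeline: the toy transcripts, labels, TF-IDF features, and logistic-regression classifier are placeholder assumptions standing in for UniC / CH-SIMS and the paper's actual models.

# Minimal sketch (not the paper's code): the same transcripts paired with
# two label sets, one from text-only annotation and one from multimodal
# annotation, to show how the two conditions can be compared.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Toy stand-ins for video transcripts (repeated to give the split enough samples).
transcripts = [
    "I really enjoyed this film",
    "The plot made no sense at all",
    "What a fantastic performance",
    "I would not watch this again",
] * 25

# Hypothetical labels: text_labels mimic annotators who saw only the transcript;
# mm_labels mimic annotators who also saw video/audio, where delivery (e.g. a
# sarcastic or smiling tone) can flip a label relative to the text alone.
text_labels = [1, 0, 1, 0] * 25
mm_labels = [1, 0, 1, 1] * 25

vec = TfidfVectorizer()
X = vec.fit_transform(transcripts)

# Train and evaluate one transcript-based model per annotation type.
for name, y in [("textual", text_labels), ("multimodal", mm_labels)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y
    )
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name} annotations: F1 = {f1_score(y_te, clf.predict(X_te)):.3f}")

Under this setup, a gap between the two F1 scores would reflect how well transcript-only features can recover each label type; the paper's finding is that textual annotations align better with transcript-based models and generalize better across datasets.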