Automatic classification of participant roles in cyberbullying: can we detect victims, bullies, and bystanders in social media text?

Publication type
A1
Publication status
In press
Authors
Jacobs, G.M., Van Hee, C., & Hoste, V.
Journal
NATURAL LANGUAGE ENGINEERING

Abstract

Successful prevention of cyberbullying depends on the adequate detection of harmful messages. Given the impossibility of human moderation on the Social Web, intelligent systems are required to identify clues of cyberbullying automatically. Much work on cyberbullying detection focuses on detecting abusive language without analyzing the severity of the event or the participants involved. Automatic analysis of participant roles in cyberbullying traces enables targeted bullying prevention strategies. In this paper, we aim to automatically detect the different participant roles involved in textual cyberbullying traces, including bullies, victims, and bystanders. We describe the construction of two cyberbullying corpora (one Dutch, one English), both manually annotated with bullying types and participant roles, and we perform a series of multiclass classification experiments to determine the feasibility of text-based cyberbullying participant role detection. The representative datasets present a data imbalance problem, for which we investigate feature filtering and data resampling as skew mitigation techniques. We investigate the performance of feature-engineered single and ensemble classifier setups as well as transformer-based pre-trained language models. Cross-validation experiments revealed promising results for the detection of cyberbullying roles using pre-trained language model fine-tuning techniques: the best classifier for English (RoBERTa) yields a macro-averaged F1-score of 55.84%, and the best for Dutch (RobBERT) yields 56.73%. Experiment replication data and source code are available at https://osf.io/nb2r3.
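The macro-averaged F1-score reported above weights each participant role equally regardless of class frequency, which is why it is a common choice for imbalanced datasets like these. As a minimal illustrative sketch (the role labels and predictions below are invented for the example and are not from the paper's data), the metric can be computed as follows:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal class weight."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical gold and predicted role labels, purely for illustration.
gold = ["bully", "victim", "bystander", "victim", "bully"]
pred = ["bully", "victim", "victim", "victim", "bystander"]
print(round(macro_f1(gold, pred), 4))  # → 0.4889
```

Note that the rare "bystander" class, misclassified here, pulls the macro average down sharply; a frequency-weighted (micro) average would mask this, which is the design rationale for reporting macro-F1 on skewed role distributions.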