Online suicide prevention through optimised text classification

Publication type: A1
Publication status: Submitted
Authors: Desmet, B., & Hoste, V.
Journal: Information Sciences
Publisher: Elsevier
Projects: SubTLe, AMiCA

Abstract

Online communication platforms are increasingly used to express suicidal thoughts. There is considerable interest in monitoring such messages, both for population-wide and individual prevention purposes, and to inform suicide research and policy. Online information overload prohibits manual detection, which is why keyword search methods are typically used. However, these are imprecise and unable to handle implicit references or linguistic noise. As an alternative, this study investigates supervised text classification to model and detect suicidality in Dutch-language forum posts. Genetic algorithms were used to optimise models through feature selection and hyperparameter optimisation. A variety of features was found to be informative, including token and character n-gram bags-of-words, presence of salient suicide-related terms, and features based on LSA topic models and polarity lexicons. The results indicate that text classification is a viable and promising strategy for detecting suicide-related and alarming messages, with F-scores comparable to those of human annotators (93% for relevant messages, 70% for severe messages). Both types of messages can be detected with high precision and minimal noise, even on large high-skew corpora. This suggests that such classifiers would be fit for use in a real-world prevention setting.
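To illustrate the kind of search the abstract refers to, the following is a minimal sketch of a genetic algorithm over binary feature masks, written with the Python standard library only. It is not the authors' implementation: the function name `evolve_feature_mask`, the parameter values, and the `toy_fitness` function are all hypothetical. In a real setting the fitness would presumably be a cross-validated F-score of a classifier trained on the selected features, and the chromosome could carry hyperparameter values alongside the feature flags.

```python
import random

# Arbitrary target mask used only to make the toy fitness function non-trivial.
# It does NOT reflect which features the study found informative.
TARGET = (1, 0, 1, 1, 0, 0, 1, 0)


def toy_fitness(mask):
    """Stand-in for a cross-validated F-score: fraction of positions matching TARGET."""
    return sum(a == b for a, b in zip(mask, TARGET)) / len(TARGET)


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Toy genetic algorithm (hypothetical helper): returns the best 0/1 mask found.

    Requires n_features >= 2 and pop_size >= 3.
    """
    rng = random.Random(seed)
    # Each individual is a tuple of flags: 1 keeps a feature, 0 drops it.
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Elitism: the best mask so far always survives into the next generation.
        next_population = [best]
        while len(next_population) < pop_size:
            # Tournament selection: the fittest of three random individuals is a parent.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # One-point crossover between the two parents.
            cut = rng.randrange(1, n_features)
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation with a small per-position probability.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


if __name__ == "__main__":
    best_mask = evolve_feature_mask(toy_fitness, n_features=len(TARGET))
    print(best_mask, toy_fitness(best_mask))
```

Because of elitism, the fitness of the returned mask can never be lower than that of the best mask in the initial random population. The fixed seed makes runs repeatable.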
