This article presents the novel manually annotated Trilingual Recognition of Irony with Confidence (TRIC) dataset for English, Dutch and Italian, publicly available on Hugging Face as Amala3/TRIC. The annotations in this dataset include irony likelihood labels, indicating how likely annotators believe a text to be ironic, as well as trigger words, indicating which words in a sentence are essential for understanding the irony. In addition to the dataset, this work investigates the development of confidence-aware models for irony detection in monolingual and multilingual setups. Results show that finetuning encoder-only models with confidence-aware labels improves performance on binary irony detection, and that finetuning on task-specific data in multiple languages yields further gains. Comparison with finetuned Llama3 indicates that generative decoder-only models perform better than confidence-aware models for English, but that encoder-only models perform best for the less-resourced languages (Dutch and Italian). Analysis of the trigger words identified by humans and by automatic systems suggests that token-level importance differs significantly between the two, but that n-gram-based clustering can reveal deeper insights. In all three languages, automatic systems tend to rely more on hyperbolic positive sentiment and interjections, whereas humans more often identify topics that are relevant to understanding the irony.