Publication:
Spanish MTLHateCorpus 2023: multi-task learning for hate speech detection to identify speech type, target, target group and intensity

Authors
Pan, Ronghao; García Díaz, José Antonio; Valencia García, Rafael
Secondary author
Facultades de la UMU::Facultad de Informática
Publisher
Elsevier
DOI
https://doi.org/10.1016/j.csi.2025.103990
Type
info:eu-repo/semantics/article
Description
Abstract
The rise of digital communication has exacerbated the challenge of tackling harmful speech online, particularly hate speech, which dehumanises individuals or groups on the basis of traits such as race, gender or ethnicity. This study highlights the urgent need for fine-grained detection methods that address several subtasks of hate speech detection: estimating its intensity, identifying the groups at which it is directed, and determining whether the target is an individual or a group. Furthermore, comprehensive Spanish-language corpora covering these subtasks are lacking. We therefore created a novel corpus, Spanish MTLHateCorpus 2023, to facilitate the analysis of hate speech across these subtasks, and evaluated a multi-task learning strategy with mBART and T5. We compared its results with those of other Large Language Models applied in a Zero-Shot Learning setting, which serve as a lower bound, and with an ensemble based on the mode of several fine-tuned models, which serves as an upper bound. The Multi-Task Learning strategy showed potential to increase model versatility, allowing a single model to tackle multiple tasks while achieving competitive results, particularly in target group recognition. However, the ensemble slightly outperformed the Multi-Task Learning strategy.
Citation
Computer Standards & Interfaces, 2025, Vol. 94 : 103990
Collections