CBRCL: A CLIP-BERT with Retrieval-Guided Contrastive Learning Multimodal Approach for Crisis-Driven Hate Speech Detection
Stepanov, Ilya ; Rashid, Junaid ; Lee, Jong Weon ; Naseem, Salman ; Razzak, Imran
Department
Computational Biology
Type
Conference proceeding
Date
2025
Language
English
Abstract
The proliferation of multimodal text and image data has raised the challenge of detecting hate speech, a critical issue in mitigating the spread of hate and propaganda on social media. The difficulty arises when an image and its text, neither of which is hateful on its own, convey a hateful meaning in combination. To address this problem, we propose CBRCL, a novel approach that uses CLIP and BERT for embedding generation and retrieval-guided contrastive learning to enhance hate speech detection. Our approach is evaluated on two datasets: CrisisHateMM, comprising text-embedded images associated with the Ukraine-Russia conflict, and LoveHate, focused on text-embedded images from the Israel-Palestine conflict. Experimental results demonstrate that CBRCL outperforms both unimodal and multimodal baselines, achieving an accuracy of 86.4% on CrisisHateMM and 76.3% on LoveHate. By combining CLIP's strong visual-text alignment with BERT's deep contextual understanding, CBRCL effectively captures subtle multimodal hate speech. Additionally, retrieval-guided contrastive learning refines the embedding space, ensuring a clearer distinction between hateful and non-hateful content. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
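The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fusion step or the loss, so the concatenation-based fusion, the in-batch retrieval of one same-label neighbour and one different-label neighbour, the margin value, and the function names `fuse` and `retrieval_guided_contrastive_loss` are all assumptions made for illustration. Precomputed CLIP and BERT embeddings are stood in for by plain NumPy arrays.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def fuse(clip_emb, bert_emb):
    """Assumed fusion: concatenate per-post CLIP and BERT embeddings,
    then L2-normalize so dot products are cosine similarities."""
    return l2_normalize(np.concatenate([clip_emb, bert_emb], axis=-1))


def retrieval_guided_contrastive_loss(z, labels, margin=0.5):
    """Illustrative triplet-style loss (an assumption, not the paper's loss).

    For each anchor, retrieve from the batch the most similar example
    with the same label (positive) and the most similar example with a
    different label (hard negative), then penalize anchors whose hard
    negative is not at least `margin` less similar than the positive.
    """
    labels = np.asarray(labels)
    sim = z @ z.T                          # pairwise cosine similarity
    same = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    losses = []
    for i in range(len(labels)):
        pos_mask = same[i] & not_self[i]
        neg_mask = ~same[i]
        if not pos_mask.any() or not neg_mask.any():
            continue                       # no usable triplet for this anchor
        pos_sim = sim[i][pos_mask].max()   # retrieved same-label neighbour
        neg_sim = sim[i][neg_mask].max()   # retrieved hard negative
        losses.append(max(0.0, margin + neg_sim - pos_sim))
    return float(np.mean(losses)) if losses else 0.0
```

Under these assumptions, a batch whose hateful and non-hateful posts already form separate clusters incurs zero loss, while a batch in which the nearest neighbours carry the opposite label incurs a positive loss that training would push down.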
Citation
I. Stepanov, J. Rashid, J. W. Lee, S. Naseem, and I. Razzak, “CBRCL: A CLIP-BERT with Retrieval-Guided Contrastive Learning Multimodal Approach for Crisis-Driven Hate Speech Detection,” pp. 1993–1999, May 2025, doi: 10.1145/3701716.3718384
Source
WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
Conference
34th ACM Web Conference, WWW Companion 2025
Keywords
Hate Speech Detection, Multimodal, Social Media
Publisher
Association for Computing Machinery
