Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

Oriol B, Canton-Ferrer C, Giró-i-Nieto X. Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation. In NeurIPS 2019 Workshop on AI for Social Good. Vancouver, Canada; 2019.


Abstract

This work addresses the challenge of hate speech detection in Internet memes and attempts to use visual information to detect hate speech automatically, which, to our knowledge, no previous work has done. Memes are pixel-based multimedia documents that combine photos or illustrations with phrases which, taken together, usually adopt a funny meaning. However, hate memes are also used to spread hate through social networks, so their automatic detection would help reduce their harmful societal impact. In our experiments, we built a dataset of 5,020 memes to train and evaluate a multi-layer perceptron over the visual and language representations, either independently or fused. Our results indicate that the model can learn to detect some of the memes, but that the task is far from being solved with this simple architecture. While previous work focuses on linguistic hate speech, our experiments indicate that, in memes, the visual modality can be much more informative for hate speech detection than the linguistic one.
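As an illustration of the setup the abstract describes, the following is a minimal sketch of a multi-layer perceptron scoring a meme from its visual representation alone, its language representation alone, or both fused. It is not the authors' implementation: the fusion by concatenation, the single hidden layer, the toy dimensions, the random untrained weights, and every function name (`make_layer`, `dense`, `fuse`, `mlp_score`) are assumptions made only for this example. The abstract does not specify how the representations are extracted, so they are stood in for by random vectors.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply an affine transform: one output value per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse(visual, text):
    """Fuse the two modalities by concatenation (an assumed fusion scheme)."""
    return list(visual) + list(text)


def mlp_score(features, hidden_layer, output_layer):
    """One hidden layer plus a sigmoid output read as a hate score in (0, 1)."""
    hidden = relu(dense(features, hidden_layer))
    (logit,) = dense(hidden, output_layer)
    return sigmoid(logit)


rng = random.Random(0)
VISUAL_DIM, TEXT_DIM, HIDDEN_DIM = 8, 6, 4  # toy sizes, not from the paper

# Placeholder representations; real ones would come from image and text encoders.
visual = [rng.random() for _ in range(VISUAL_DIM)]
text = [rng.random() for _ in range(TEXT_DIM)]
fused = fuse(visual, text)

# Three variants with untrained weights: visual only, language only, fused.
scores = {}
for name, features in [("visual", visual), ("language", text), ("fused", fused)]:
    hidden_layer = make_layer(len(features), HIDDEN_DIM, rng)
    output_layer = make_layer(HIDDEN_DIM, 1, rng)
    scores[name] = mlp_score(features, hidden_layer, output_layer)
```

Because the weights are untrained, the scores carry no meaning. The sketch only shows how the same classifier can be applied to either modality or to their concatenation, which is the comparison the abstract reports.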

Paper on arXiv.
Project page on GitHub.
Video recorded at NeurIPS 2019 Vancouver (starts at 1:40:00)
NeurIPS Joint Workshop on AI for Social Good 2019.

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation from Universitat Politècnica de Catalunya

Projects

	MALEGRA - Multimodal Signal Processing and Machine Learning on Graphs
	Language and Vision

Image Processing Group
