Harmful content detection

Detection and classification of online content that causes harm, including hate speech, cyberbullying, violent content, sexually explicit material, spam, and trolling. Distinct from misinformation detection, which focuses on factuality, but the two are often handled together in real-world content moderation systems.

Key papers

A Survey on Multimodal Disinformation Detection — comprehensive survey covering harmful content detection across text, image, speech, video, and network modalities
Hameleers et al. (2020) — experimental evidence on multimodal harmful content perception and fact-checking effectiveness

Related topics

Multimodal fake news detection (detection across modalities)
Hate speech detection and moderation (subset of harmful content)
Disinformation (distinct but often co-occurring)