Harmful content detection

Detection and classification of harmful online content, including hate speech, cyberbullying, violent content, sexually explicit material, spam, and trolling. Distinct from misinformation detection (which focuses on factuality), but the two are often addressed together in real-world content moderation systems.

Key papers