GUIDE

Hate speech detection and moderation

The detection, classification, and removal of speech that targets protected groups or promotes violence and discrimination. The topic encompasses algorithmic detection systems, human content moderation, and platform policy approaches.

Key papers

Horta Ribeiro et al. (2019) — characterizes alt-right YouTube channels and their hate speech content, and audits how platform recommendations reach users across a spectrum of hateful content.

Political extremism and radicalization — far-right and extremist rhetoric
Content moderation — platform governance and policy enforcement
Online harassment — coordinated hate campaigns
Platform auditing and analysis — empirical audit of platform content and algorithms