Hate speech detection and moderation

Detection, classification, and removal of hateful speech targeting protected groups or promoting violence and discrimination. Encompasses algorithmic detection systems, human content moderation, and platform policy approaches.

Key papers

  • Horta Ribeiro et al. (2019) — characterizes Alt-right YouTube channels and their hate speech content; audits how platform recommendations reach users across a spectrum of hateful content.