Hate speech detection

Hate speech detection is the task of identifying hateful speech: language that attacks individuals or groups based on protected characteristics such as race, ethnicity, religion, gender identity, or other identity attributes. It is a specialized subdomain of toxicity detection with distinct challenges: attacks are targeted at specific groups, hatred is often expressed implicitly, and harmful speech must be distinguished from protected speech and satire.

Key papers