Offensive language detection

Computational identification and characterization of offensive language in social media, including abusive language, profanity, insults, threats, and targeted harassment. Offensive language detection is an umbrella task that encompasses hate speech detection, cyberbullying detection, and toxicity detection. Two distinctions matter most: whether the language is targeted (aimed at individuals or groups) or untargeted (general profanity), and, for targeted language, what kind of target it has (an individual person, a social group, or some other entity).
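
The targeted/untargeted and target-type distinctions above form a hierarchy, which can be sketched as a small data structure. This is a minimal illustration only: the class and method names (`Offense`, `Targeting`, `Target`, `HierarchicalLabel`, `path`) are hypothetical, and the short label codes are assumed to follow the ones commonly used with the OLID dataset (OFF/NOT, TIN/UNT, IND/GRP/OTH).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Offense(Enum):
    NOT = "NOT"  # not offensive
    OFF = "OFF"  # offensive


class Targeting(Enum):
    UNT = "UNT"  # untargeted (e.g. general profanity)
    TIN = "TIN"  # targeted insult or threat


class Target(Enum):
    IND = "IND"  # individual person
    GRP = "GRP"  # social group
    OTH = "OTH"  # other entity


@dataclass(frozen=True)
class HierarchicalLabel:
    """One post's label; lower levels exist only if the level above allows them."""

    offense: Offense
    targeting: Optional[Targeting] = None
    target: Optional[Target] = None

    def __post_init__(self) -> None:
        # Level 2 is defined only for offensive posts.
        if self.offense is Offense.NOT and self.targeting is not None:
            raise ValueError("non-offensive posts carry no targeting label")
        if self.offense is Offense.OFF and self.targeting is None:
            raise ValueError("offensive posts need a targeting label")
        # Level 3 is defined only for targeted posts.
        if self.targeting is Targeting.TIN and self.target is None:
            raise ValueError("targeted posts need a target label")
        if self.targeting is not Targeting.TIN and self.target is not None:
            raise ValueError("only targeted posts carry a target label")

    def path(self) -> Tuple[str, ...]:
        """Return the label codes from the top level down to the deepest one set."""
        levels = (self.offense, self.targeting, self.target)
        return tuple(level.value for level in levels if level is not None)
```

A post attacking a social group would be built as `HierarchicalLabel(Offense.OFF, Targeting.TIN, Target.GRP)`, while general profanity stops at the second level with `HierarchicalLabel(Offense.OFF, Targeting.UNT)`. Rejecting inconsistent combinations at construction time is one possible design choice; a pipeline could equally store the three levels as separate nullable columns.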

Key papers

Predicting the Type and Target of Offensive Posts in Social Media — OLID dataset, which introduces a hierarchical three-level annotation scheme (offensive/not offensive → targeted/untargeted → individual/group/other target)
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media — Shared task with 115 systems evaluated on OLID; BERT-based and ensemble approaches dominated; benchmark results cover three sub-tasks: offensive language detection (104 teams), offense categorization (71 teams), and target identification (66 teams)
Hate Lingo: A Target-based Linguistic Analysis of Hate Speech in Social Media — Distinguishes directed hate speech from generalized hate speech
[[2017-schmidt-hate-speech-detection]] — Comprehensive survey of NLP methods for hate speech detection

Related topics

Hate speech detection — offensive language targeting protected groups
Toxicity detection — toxic and abusive language more broadly
Cyberbullying detection — targeted harassment toward individuals
Content moderation — platform policies and moderation mechanisms