Hate Lingo: A Target-based Linguistic Analysis of Hate Speech in Social Media¶

Authors: Mai ElSherief, Vivek Kulkarni, Dana Nguyen, William Yang Wang, Elizabeth Belding
Venue: ICWSM, 2018 — arXiv:1804.04257

TL;DR¶

This paper presents the first large-scale linguistic and psycholinguistic analysis of hate speech categorized by its target: directed (personal attacks) versus generalized (targeting groups). The authors analyze 28,318 directed and ~2,500 generalized hate tweets, revealing that directed hate is more informal, angry, and explicitly attacks targets via name-calling; generalized hate is dominated by religious content and features lethal language and quantity words. The distinction has social and legal implications for free speech policy.

Contributions¶

First extensive study of hate speech distinguished by target (individual vs. group), rather than just binary hate/non-hate classification
Lexical analysis using SAGE (Sparse Additive Generative models) identifying salient words and semantic domains for each hate type
Psycholinguistic analysis using LIWC showing directed hate exhibits higher clout, lower analytical thinking, and more anger than generalized hate
Semantic frame analysis using SEMAFOR revealing directed hate invokes intentional acts and hindering frames; generalized hate evokes people-by-religion, killing, and quantity frames
Curated dataset of 28,318 directed and ~2,500 generalized hate tweets with human annotation achieving high inter-rater reliability (Krippendorff's α = 0.622)

Method¶

The authors employ a multi-step dataset collection approach:

Data collection: From Twitter Streaming API (Jan 2016–July 2017), they filter using: (1) Hatebase keywords (using Perspective API to ensure high-quality toxicity scores), (2) hate-related hashtags, (3) existing public datasets (Waseem & Hovy, Davidson et al., NHSM), and (4) general 1% sample for baseline comparison. To ensure directed hate targets individuals, they require @-mentions and second-person pronouns.

Annotation: 2,000-tweet sample annotated via CrowdFlower with 3+ independent annotators per tweet; annotators filtered to 80% accuracy baseline. Final dataset achieves 97.8% agreement on directed hate detection and 94.3% agreement on directedness classification.

Linguistic analysis: - SAGE: Supervised topic modeling to extract salient words distinguishing directed vs. generalized hate within each category (archaic, class, disability, ethnicity, gender, nationality, religion, sexual orientation) - Named Entity Recognition (T-NER): Extract entities to characterize what/who is being attacked - LIWC 2015: Psycholinguistic lexicon measuring analytic thinking, clout, authenticity, emotional tone, informality, social language, pronouns, negative emotions, temporal focus, personal concerns - SEMAFOR: Semantic frame analysis to identify high-level conceptual frames evoked (e.g., intentional acts, people-by-religion, killing)

Results¶

Lexical findings: - Directed hate shows minimal word overlap with generalized hate; each category (disability, ethnicity, etc.) has distinct topics - Entity analysis: Directed hate has 55.8% person mentions vs. 42.1% in generalized hate; generalized contains more religious entities (Islam, Jews, Muslims, Christians)

Psycholinguistic findings: - Directed hate: higher clout (μ = 70.7, p < 0.001) and lower analytical thinking (μ = 43.9, p < 0.001) - Directed hate: higher informality (μ = 17.1) and social language (μ = 16.1) - Generalized hate: emphasizes third-person pronouns ("they," 1.4 vs. "we," 0.5; 2.8× difference, p < 0.001) - Directed hate: higher anger (μ = 7.6) > Generalized (μ = 3.6) > General (μ = 0.9) - Generalized hate: higher death references (μ = 1.2, p = 0.1) vs. Directed (μ = 0.34, p < 0.001)

Semantic frames: - Directed hate evokes intentionally_act (0.05, p < 0.01), statement, and hindering frames - Generalized hate evokes people_by_religion (0.06, p < 0.01), killing (0.03), and quantity frames - Top words for directed: do, did, get, mentions, retard, retarded - Top words for generalized: kill, murder, exterminate, jews, christians, muslims, million

Connections¶

Hate speech detection — foundational linguistic characterization of hate speech types
Toxicity detection — psycholinguistic markers of toxicity differentiate personal vs. group-targeted abuse
Linguistic Analysis of Fake News — demonstrates application of LIWC, frame semantics, and sparse additive models to offensive language
Content moderation — implications for platform moderation policy distinguishing directed vs. generalized threats

Notes¶

Strengths: Rigorous human annotation pipeline (high inter-rater reliability); multi-faceted linguistic analysis (lexical, psycholinguistic, semantic) revealing nuanced distinctions; large, well-curated dataset; clear social/legal implications (First Amendment framing). The distinction between directed and generalized hate is practically useful for moderation and policy.

Limitations: Twitter-only; relies on keyword filtering (may miss implicit hate); annotators recruited from CrowdFlower (potential cultural/geographic biases); SEMAFOR not trained on Twitter may introduce domain-shift errors. The paper acknowledges that generalized hate can have wider mobilization impact than individual attacks, complicating the legal framing.

Follow-ups: Could extend to other platforms (Facebook, TikTok, Reddit); explore whether directed vs. generalized distinction predicts effectiveness of counter-speech or moderation interventions; investigate whether the linguistic markers generalize cross-lingually or across different subcommunities.