Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking¶

Authors: Hannah Rashkin, Eunsol Choi, Jin Yea Jang, Svitlana Volkova, Yejin Choi

Venue: EMNLP 2017

TL;DR¶

This paper performs a linguistic analysis of fake news across four categories—satire, hoaxes, propaganda, and trusted news—using lexical features (LIWC, subjectivity, hedging, intensifiers). Experiments show that fake news articles use more first-person pronouns, exaggerative language (superlatives, modal adverbs), and hedging, while trusted news uses more concrete language and assertive verbs. The authors also present a 6-point graded truthfulness prediction task on PolitiFact data, achieving 22% F1 on 6-class prediction and 52% on 2-class.

Contributions¶

Comprehensive linguistic analysis comparing fake news types (satire, hoax, propaganda) with trusted news on 13+ lexical dimensions
Corpus of ~75k news articles from reliable and unreliable sources, organized by news type and intent
4-way news reliability classification (trusted/satire/hoax/propaganda) achieving 65% F1
Fine-grained truthfulness prediction task on 4,366 PolitiFact statements across a 6-point scale
Demonstration that stylistic lexical features help distinguish between deceptive and trustworthy text

Method¶

The paper analyzes language across two dimensions:

Section 2: Fake News Analysis - Collected articles from English Gigaword (trusted news) and seven unreliable news sites, categorized as satire (The Onion, Borowitz Report, Clickhole), hoax (American News, DC Gazette), and propaganda (The Natural News, Activist Report). - Applied lexical resources: LIWC (Linguistic Inquiry and Word Count), sentiment lexicons for subjectivity (Wilson et al., 2005), hedging lexicons (Hyland, 2015), and custom intensifier lists from Wiktionary (comparatives, superlatives, action/manner/modal adverbs). - Computed per-document counts for each lexicon and reported averages, with statistical significance testing (Welch t-test with Bonferroni correction). - Trained a Max-Entropy classifier with L2 regularization on n-gram tf-idf features (up to trigrams) in a cross-domain setting (training on mixed sources, testing on held-out sources).

Section 3: Predicting Truthfulness - Collected 10,483 labeled statements from PolitiFact and related sites, using a subset of 4,366 direct quotes. - Formulated a 6-class prediction task (True, Mostly True, Half-true, Mostly False, False, Pants-on-Fire) and a binary (2-class) version. - Compared three models: LSTM (with GloVe embeddings), Max-Entropy, and Naive Bayes, with optional LIWC features concatenated. - Split data {train: 2575, dev: 712, test: 1074} with speaker-aware splits to avoid test leakage.

Results¶

News Reliability (4-way classification) - Max-Entropy classifier achieved 65% F1 on out-of-domain test set (25% is random baseline). - Development set (in-domain) achieved 91% F1, indicating domain shift is the main challenge.

Linguistic Feature Findings (ratios of fake/trusted frequencies): - First-person singular pronouns: 2.06× more in fake news - Second-person pronouns: 6.73× (especially DC Gazette hoaxes) - Swearing: 7.00× (Borowitz Report satire) - Modal adverbs, action adverbs, manner adverbs: 2.63×, 2.18×, 1.87× in fake news - Strong subjectives: 1.51× in fake news - Hedging: 1.19× in fake news - Superlatives: 1.17× in propaganda

Conversely, trusted news shows: - Numbers: 0.43× (more in trusted) - Assertive verbs: 0.84× (more in trusted) - Comparatives: 0.86× (more in trusted) - Money references: 0.57× (more in trusted)

Fine-grained Truthfulness Prediction - On PolitiFact data, LSTM with text alone achieved 58% macro F1 on 2-class (dev) and 56% on test. - Adding LIWC features improved Max-Entropy and Naive Bayes (both to 55–58% on 2-class), but hurt or negligibly helped LSTM. - 6-class prediction is harder: best models achieved ~22% macro F1 (close to 20% random baseline for near-uniform class distribution).

Connections¶

Related to Propagation-based fake news detection in that linguistic features complement graph-based approaches.
Extends Rubin 2015 Deception Detection by quantifying linguistic differences across fake news types (satire vs. hoax vs. propaganda).
Cited by 2022 Wang Detecting Propaganda and other fake news detection work using LIWC features.
Foundational for understanding the role of hedging, subjectivity, and intensity in deceptive text.

Notes¶

Strengths: - First systematic study to differentiate fake news types (satire, hoax, propaganda) linguistically, not just binary true/false. - Graded labels on PolitiFact (6-point scale) reflect real-world fact-checking complexity better than binary classifications. - Comprehensive lexical analysis across multiple domains and sources. - Public release of corpus and PolitiFact annotations supports reproducibility.

Limitations: - The 65% F1 on 4-way news classification and 22% on 6-class truthfulness prediction leave significant room for improvement, suggesting stylistic features alone are insufficient. - Cross-domain test (different sources) performs much worse than in-domain, indicating overfitting to source-specific cues. - LSTM + LIWC features do not improve over text alone, suggesting some redundancy in the feature engineering. - Graded truthfulness task is inherently harder; the paper does not deeply analyze why intermediate labels (e.g., "Mostly True") are hard to predict.

Impact: - Established a baseline for linguistic analysis in fake news detection and motivated follow-up work on graded labels and context-aware models. - The PolitiFact dataset remains widely used in fact-checking research.