Beyond News Contents: The Role of Social Context for Fake News Detection

Authors: Kai Shu, Suhang Wang, Huan Liu
Venue: The Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19), February 11–15, 2019, Melbourne, VIC, Australia
DOI: 10.1145/3289600.3290994

TL;DR

Content-only detection is insufficient because fake news is deliberately written to mimic real news. This paper argues that the tri-relationship among publishers, news pieces, and users provides rich social context for detection: partisan-biased publishers are more likely to publish fake news, and low-credibility users are more likely to share it. TriFN jointly embeds publisher-news relations and user-news interactions, improving F1 by roughly 4–6% (relative) over the strongest baseline on FakeNewsNet. It reaches >80% F1 within 48 hours of publication, which enables early detection.

Contributions

Proposes the tri-relationship embedding perspective: publishers, news, and users form an inherent triadic network where each relationship provides complementary signals.
Introduces publisher partisan bias as a structural signal: partisan publishers tend to distort facts to align with their bias, so publisher bias labels add predictive signal beyond news content alone.
Incorporates user credibility into user-news interactions: models interactions weighted by per-user credibility scores inferred from clustering patterns.
Develops TriFN (Tri-relationship embedding Framework), a joint learning model combining: (1) news content embedding (NMF), (2) user social embedding (factorization), (3) user-news interaction embedding (credibility-weighted), (4) publisher-news relation embedding (partisan-bias regularization), and (5) semi-supervised classification.
Demonstrates early fake news detection: achieves >80% F1 within 48 hours on both BuzzFeed and PolitiFact partitions of FakeNewsNet.

Method

Problem setup. Given: a news content matrix $X \in \mathbb{R}^{n \times t}$ (bag-of-words over $n$ news pieces), a user-user adjacency matrix $A$, a binary user-news interaction matrix $W$, a publisher-news matrix $B$, and partisan bias labels $o \in \{-1, 0, 1\}^{l \times 1}$ for a subset of publishers. Goal: classify each unlabeled news piece as fake ($y_j = 1$) or true ($y_j = -1$).

News content embedding. Nonnegative matrix factorization projects news from bag-of-words to latent semantic space: $$\min_{D,V \geq 0} \|X - DV^T\|_F^2 + \lambda(\|D\|_F^2 + \|V\|_F^2)$$ where $D \in \mathbb{R}^{n \times d}$ is the news latent feature matrix.
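
The factorization above can be sketched with standard multiplicative updates. This is a minimal illustration under assumptions: the function names, the toy matrix, and the choice of multiplicative updates as the solver are not taken from the paper, whose exact update rules are not reproduced in these notes.

```python
import numpy as np

def nmf_embed(X, d, lam=0.1, iters=200, seed=0, eps=1e-9):
    """Approximate a nonnegative n x t matrix X as D @ V.T with D (n x d), V (t x d)."""
    rng = np.random.default_rng(seed)
    n, t = X.shape
    D = rng.random((n, d))
    V = rng.random((t, d))
    for _ in range(iters):
        # Multiplicative updates (assumed solver): entries stay nonnegative
        # because every factor in the ratio is nonnegative.
        D *= (X @ V) / (D @ (V.T @ V) + lam * D + eps)
        V *= (X.T @ D) / (V @ (D.T @ D) + lam * V + eps)
    return D, V

def nmf_loss(X, D, V, lam):
    """Regularized reconstruction error from the objective above."""
    return np.sum((X - D @ V.T) ** 2) + lam * (np.sum(D ** 2) + np.sum(V ** 2))

# Toy bag-of-words matrix: 6 news pieces, 10 vocabulary terms (made-up counts).
X_toy = np.random.default_rng(1).poisson(1.0, size=(6, 10)).astype(float)
D_toy, V_toy = nmf_embed(X_toy, d=3)
```

Each row of `D_toy` is the latent feature vector of one news piece, which the later terms of the model share.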

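User social embedding. NMF on the user-user adjacency matrix captures social homophily (like-minded users tend to form relationships): $$\min_{U,T \geq 0} \|Y \odot (A - UTU^T)\|_F^2 + \lambda(\|U\|_F^2 + \|T\|_F^2)$$ where $U$ is the user latent feature matrix, $T$ is a $d \times d$ matrix coupling the latent dimensions, and $Y$ weights the entries of $A$.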
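
A small worked example may help read the weighted loss, assuming the tri-factorization form $A \approx UTU^T$ that the regularizer on $T$ implies. The toy graph, the function name, and the use of $Y$ as a simple mask that ignores the diagonal are all illustrative assumptions; the paper's construction of $Y$ is not covered in these notes.

```python
import numpy as np

def social_loss(A, Y, U, T, lam):
    """Weighted reconstruction error of A by U @ T @ U.T plus L2 regularization."""
    residual = Y * (A - U @ T @ U.T)
    return np.sum(residual ** 2) + lam * (np.sum(U ** 2) + np.sum(T ** 2))

# Four users in two friendship pairs: {0, 1} and {2, 3}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = 1.0 - np.eye(4)          # assumed mask: ignore self-links
T = np.eye(2)

U_good = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # matches the pairs
U_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)   # splits the pairs

print(social_loss(A, Y, U_good, T, lam=0.0))  # 0.0
print(social_loss(A, Y, U_bad, T, lam=0.0))   # 8.0
```

The embedding that places linked users in the same latent group reconstructs every off-diagonal entry, while the mismatched one gets four links and four non-links wrong.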

User-news interaction embedding. Models the observation that high-credibility users preferentially share true news, while low-credibility users preferentially share fake news. User credibility $c_i \in [0, 1]$ is inferred from clustering (less credible users tend to coordinate and form larger clusters). The interaction term is $$\min_{U,D_L \geq 0} \sum_{i,j} W_{ij} \left[ c_i(1 - \tfrac{1 + y_{Lj}}{2}) + (1-c_i)(\tfrac{1 + y_{Lj}}{2}) \right] \|U_i - D_{Lj}\|_2^2$$ For true news ($y_{Lj} = -1$) the bracket reduces to $c_i$, and for fake news ($y_{Lj} = 1$) it reduces to $1 - c_i$, so the distance penalty pulls high-credibility users toward the true news they share and low-credibility users toward the fake news they share.
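
The interaction term can be evaluated directly, as in the sketch below. The function name and the two-user, two-news toy data are made up for illustration; only the formula comes from the text above.

```python
import numpy as np

def interaction_loss(W, c, y_L, U, D_L):
    """Credibility-weighted squared distance between users and labeled news.

    W: (m, r) binary interactions, c: (m,) credibility in [0, 1],
    y_L: (r,) labels with +1 = fake and -1 = true,
    U: (m, d) user embeddings, D_L: (r, d) labeled-news embeddings.
    """
    fake = (1 + y_L) / 2                      # 1 for fake news, 0 for true news
    # weight[i, j] = c_i if news j is true, (1 - c_i) if news j is fake
    weight = np.outer(c, 1 - fake) + np.outer(1 - c, fake)
    diff = U[:, None, :] - D_L[None, :, :]    # (m, r, d) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.sum(W * weight * sq_dist)

W = np.ones((2, 2))                  # both users shared both news pieces
c = np.array([0.9, 0.1])             # user 0 is credible, user 1 is not
y_L = np.array([-1, 1])              # news 0 is true, news 1 is fake
U = np.array([[0.0, 0.0], [1.0, 1.0]])

aligned = np.array([[0.0, 0.0], [1.0, 1.0]])   # true news near user 0, fake near user 1
swapped = np.array([[1.0, 1.0], [0.0, 0.0]])   # the opposite placement

print(interaction_loss(W, c, y_L, U, aligned))  # about 0.4
print(interaction_loss(W, c, y_L, U, swapped))  # about 3.6
```

The aligned placement is penalized only through the small weights (0.1), while the swapped placement pays the large weights (0.9) on every shared item.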

Publisher-news relation embedding. Partisan-biased publishers are more likely to publish fake news. The latent news features are regularized so that each publisher's latent representation (the average over the news it published) predicts its partisan bias: $$\min_{D \geq 0, q} \|e \odot (\bar{B}Dq - o)\|_2^2 + \lambda\|q\|_2^2$$ where $\bar{B}$ is the row-normalized publisher-news matrix $B$ and $e$ masks out publishers without a bias label.
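
As a rough illustration of this regularizer, the sketch below row-normalizes $B$, averages news embeddings per publisher, and scores the prediction against the bias labels. The function name, the toy matrices, and the reading of $e$ as a 0/1 mask over publishers are assumptions made for the example.

```python
import numpy as np

def publisher_loss(B, D, q, o, e, lam):
    """Masked squared error between predicted and labeled partisan bias."""
    row_sums = B.sum(axis=1, keepdims=True)
    B_bar = B / np.maximum(row_sums, 1)   # row-normalize; all-zero rows stay zero
    pred = B_bar @ D @ q                  # one bias score per publisher
    return np.sum((e * (pred - o)) ** 2) + lam * np.sum(q ** 2)

# 3 publishers, 4 news pieces, 2 latent dimensions (toy values).
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
D = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
q = np.array([1.0, -1.0])
o = np.array([1.0, -1.0, 1.0])            # bias labels; the third is treated as unknown

# Predicted bias is [1, -1, 0]; masking the third publisher removes its error.
print(publisher_loss(B, D, q, o, e=np.array([1.0, 1.0, 0.0]), lam=0.0))  # 0.0
print(publisher_loss(B, D, q, o, e=np.array([1.0, 1.0, 1.0]), lam=0.0))  # 1.0
```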

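Unified objective (TriFN). All components are combined in one joint optimization problem with hyperparameters $\alpha, \beta, \gamma, \eta$: $$\min \|X - DV^T\|_F^2 + \alpha \|Y \odot (A - UTU^T)\|_F^2 + \beta\, \text{tr}(H^T LH) + \gamma \|e \odot (\bar{B}Dq - o)\|_2^2 + \eta \|D_L p - y_L\|_2^2 + \lambda R$$ Here $\beta\,\text{tr}(H^T LH)$ is the user-news interaction term rewritten in trace form, $\eta \|D_L p - y_L\|_2^2$ is the semi-supervised linear classifier on labeled news, and $R$ collects the regularizers.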
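
The $\text{tr}(H^T LH)$ term is compact notation for a weighted sum of pairwise squared distances. The check below assumes the standard graph-Laplacian construction: stack users and labeled news into $H$, place the interaction weights $G$ in a bipartite adjacency matrix $F$, and set $L = S - F$ with $S$ the diagonal degree matrix. These names and this construction are assumptions and may differ from the paper's notation.

```python
import numpy as np

def pairwise_loss(G, U, D_L):
    """Sum over (i, j) of G[i, j] * ||U[i] - D_L[j]||^2."""
    diff = U[:, None, :] - D_L[None, :, :]
    return np.sum(G * np.sum(diff ** 2, axis=2))

def trace_loss(G, U, D_L):
    """Same quantity written as tr(H^T L H) with a bipartite graph Laplacian."""
    m, r = G.shape
    F = np.block([[np.zeros((m, m)), G],
                  [G.T, np.zeros((r, r))]])   # symmetric user-news adjacency
    L = np.diag(F.sum(axis=1)) - F            # Laplacian L = S - F
    H = np.vstack([U, D_L])                   # users stacked on labeled news
    return np.trace(H.T @ L @ H)

rng = np.random.default_rng(0)
G = rng.random((5, 3))        # nonnegative interaction weights (toy values)
U = rng.random((5, 4))
D_L = rng.random((3, 4))
print(np.isclose(pairwise_loss(G, U, D_L), trace_loss(G, U, D_L)))  # True
```

The trace form is convenient because it turns the interaction term into a quadratic in $H$, which fits naturally beside the other matrix terms in the objective.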

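Solved via alternating least squares (ALS): each variable is updated in turn with the others held fixed, using update rules derived from the KKT conditions. The paper reports that the procedure converges.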
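
To illustrate one alternating step: with every other variable fixed, the subproblem in the classifier weights $p$ is a ridge regression, assuming the regularizer $R$ contains $\|p\|_2^2$ (these notes do not spell out $R$). The function name and toy data below are hypothetical.

```python
import numpy as np

def update_classifier(D_L, y_L, eta, lam):
    """Closed-form minimizer of eta * ||D_L p - y_L||^2 + lam * ||p||^2 over p."""
    d = D_L.shape[1]
    lhs = eta * D_L.T @ D_L + lam * np.eye(d)
    rhs = eta * D_L.T @ y_L
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(0)
D_L = rng.random((6, 3))                              # embeddings of 6 labeled news pieces
y_L = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])     # +1 fake, -1 true
p = update_classifier(D_L, y_L, eta=1.0, lam=0.1)

# At the minimizer the gradient of the subproblem vanishes.
grad = 1.0 * D_L.T @ (D_L @ p - y_L) + 0.1 * p
print(np.allclose(grad, 0.0))  # True
```

The nonnegative factors ($D$, $V$, $U$, $T$) cannot be solved this way because of their constraints, which is where the KKT-derived update rules come in.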

Results

Datasets. FakeNewsNet (Shu et al. 2018): BuzzFeed (182 news, 91 fake, 15K users, 25K engagements) and PolitiFact (240 news, 120 fake, 24K users, 37K engagements).

Baselines. Content-only (RST, LIWC), social-only (Castillo), and hybrid (RST+Castillo, LIWC+Castillo). Each tested across 7 classifiers (LogReg, Naïve Bayes, DTree, RForest, XGBoost, AdaBoost, GradBoost); best reported.

Main results (Table 2):

Method	BuzzFeed F1	PolitiFact F1
RST	0.633	0.615
LIWC	0.709	0.666
Castillo	0.797	0.822
RST+Castillo	0.805	0.835
LIWC+Castillo	0.822	0.843
TriFN	0.870	0.880

TriFN achieves 4.72% relative improvement (BuzzFeed Accuracy), 5.84% (BuzzFeed F1), 5.91% (PolitiFact Accuracy), 4.39% (PolitiFact F1) vs. LIWC+Castillo.
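
The F1 figures can be reproduced from the table with a one-line calculation. The helper name is arbitrary, and only the F1 values listed above are used, since accuracy is not tabulated in these notes.

```python
def relative_improvement(new, base):
    """Relative gain of `new` over `base`, in percent."""
    return 100 * (new - base) / base

# (TriFN F1, LIWC+Castillo F1) from the table above.
f1_scores = {"BuzzFeed": (0.870, 0.822), "PolitiFact": (0.880, 0.843)}
gains = {name: round(relative_improvement(trifn, base), 2)
         for name, (trifn, base) in f1_scores.items()}
print(gains)  # {'BuzzFeed': 5.84, 'PolitiFact': 4.39}
```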

Component ablation (Figure 3). Removing the publisher partisan-bias term ($\gamma=0$) drops F1 by ~3–4%. Removing the user social-engagement terms ($\alpha = \beta = 0$) drops F1 by ~5–7%. Removing both degrades performance further, which suggests the two signals are complementary.

Early detection (Figure 4). At 12 hours post-publication: TriFN F1 ~0.60–0.65. At 48 hours: F1 >0.80. Consistent advantage over baselines even with sparse early data.
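
One plausible way to set up such an experiment is to rebuild the interaction matrix $W$ from only the engagements observed before a cutoff. The sketch assumes each engagement is a `(user, news, hours_after_publication)` tuple; that layout, the function name, and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def interactions_within(engagements, n_users, n_news, max_hours):
    """Binary user-news matrix built from engagements no later than max_hours."""
    W = np.zeros((n_users, n_news), dtype=int)
    for user, news, hours in engagements:
        if hours <= max_hours:
            W[user, news] = 1
    return W

# Toy engagements: (user index, news index, hours after the news was published).
engagements = [(0, 0, 2.0), (1, 0, 30.0), (2, 1, 11.5), (0, 1, 60.0)]

W_12h = interactions_within(engagements, n_users=3, n_news=2, max_hours=12)
W_48h = interactions_within(engagements, n_users=3, n_news=2, max_hours=48)
print(W_12h.sum(), W_48h.sum())  # 2 3
```

A shorter window leaves a sparser $W$, which is consistent with the lower F1 reported at 12 hours.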

Parameter analysis (Figure 5). Varying $\eta$ (classification weight) and $\gamma$ (publisher bias weight): performance plateaus over reasonable ranges ($\eta \in [1, 50]$, $\gamma \in [1, 50]$) and is stable for the social-term coefficients $\alpha, \beta \in [10^{-5}, 10^{-3}]$.

Connections

FakeNewsNet dataset: the benchmark used; also see Shu et al. (2018) dataset paper.
User profiles for fake news: related work; see Shu et al. (2019) UPF for complementary user-feature approach.
Publisher partisan bias: core signal; relates to journalism research on partisan distortion.
User credibility: inferred via clustering; see Castillo et al. (2011) for complementary stance-based credibility.
Social context for detection: exemplar of tri-partite network analysis for misinformation.
Semi-supervised learning: labels only a subset of news; leverages unlabeled signal.
Shu et al. (2019) — Hierarchical RvNN: contemporaneous work from same group using RNNs on propagation trees.
Shu et al. (2019) — DEFEND: extends to explainable detection.

Notes

Strength. The tri-relationship framing is intuitive and empirically well-motivated. Partisan bias as a structural signal is novel compared to prior content/user-based work. Early detection (48 hrs) is practically important. Joint learning captures complementarity that concatenated features do not.

Limitations.
- User credibility is inferred from clustering patterns, assuming malicious users coordinate. This may not hold in all domains (e.g., organic misinformation).
- Partisan bias labels sourced from Media Bias Fact Check (MBFC), a subjective source; only "left," "least-biased," "right" labels used, excluding nuance. Generalization to non-US-political or non-partisan misinformation unclear.
- Evaluation limited to FakeNewsNet's fact-checked news (BuzzFeed, PolitiFact). Real-world fake news evolves; static datasets may not capture temporal drift.
- Comparison with propagation-based methods (e.g., rumor diffusion networks) explicitly excluded; architectural choice limits scope of conclusions about social signals.
- Hyperparameter tuning on both datasets; some overfitting risk given dataset sizes (91–120 fake per partition).

Significance. TriFN helped establish social context as a first-class signal in fake news detection, influencing subsequent work on graph-based and heterogeneous network approaches. The tri-relationship concept is reused in later papers (e.g., hierarchical propagation trees, explainable systems).