Beyond News Contents: The Role of Social Context for Fake News Detection
Authors: Kai Shu, Suhang Wang, Huan Liu
Venue: The Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19), February 11–15, 2019, Melbourne, VIC, Australia
DOI: 10.1145/3289600.3290994
TL;DR
Content-only detection is insufficient because fake news is crafted to mimic real news. This paper argues that the tri-relationship among publishers, news pieces, and users provides rich social context for detection: partisan-biased publishers are more likely to publish fake news, and low-credibility users are more likely to share it. TriFN jointly embeds publisher-news relations and user-news interactions, achieving a 4–6% relative F1 improvement over the strongest baseline on FakeNewsNet. It reaches >80% F1 using only the engagements from the first 48 hours after publication, which supports early detection.
Contributions
- Proposes the tri-relationship embedding perspective: publishers, news, and users form an inherent triadic network where each relationship provides complementary signals.
- Introduces publisher partisan bias as a structural signal: partisan publishers distort facts to align with their bias, making partisan-labeled news more predictive than content alone.
- Incorporates user credibility into user-news interactions: models interactions weighted by per-user credibility scores inferred from clustering patterns.
- Develops TriFN (Tri-relationship embedding Framework), a joint learning model combining: (1) news content embedding (NMF), (2) user social embedding (factorization), (3) user-news interaction embedding (credibility-weighted), (4) publisher-news relation embedding (partisan-bias regularization), and (5) semi-supervised classification.
- Demonstrates early fake news detection: achieves >80% F1 within 48 hours on both BuzzFeed and PolitiFact partitions of FakeNewsNet.
Method
Problem setup. Given: a news content matrix \(X \in \mathbb{R}^{n \times t}\) (bag-of-words), a user-user adjacency matrix \(A\), a binary user-news interaction matrix \(W\), a publisher-news matrix \(B\), and partisan bias labels \(o \in \{-1, 0, 1\}^{l \times 1}\) for a subset of publishers. Goal: classify unlabeled news as fake (\(y_j = 1\)) or true (\(y_j = -1\)).
News content embedding. Nonnegative matrix factorization (NMF) projects news from the bag-of-words space into a latent semantic space: \[\min_{D,V \geq 0} \|X - DV^T\|_F^2 + \lambda(\|D\|_F^2 + \|V\|_F^2)\] where \(D \in \mathbb{R}^{n \times d}\) is the news latent feature matrix and \(V \in \mathbb{R}^{t \times d}\) is the word latent feature matrix.
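The factorization above can be sketched with multiplicative updates, a common way to keep both factors nonnegative. This is a minimal illustration, not the paper's own update rules: the function names `nmf_embed` and `nmf_loss`, the random toy matrix, and the values of `lam`, `n_iter`, and `eps` are all assumptions made for the example.

```python
import numpy as np

def nmf_embed(X, d, lam=0.1, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative n x t matrix X as D @ V.T with D, V >= 0.

    Uses generic multiplicative updates for the L2-regularized Frobenius
    objective (an assumed solver, chosen for brevity).
    """
    rng = np.random.default_rng(seed)
    n, t = X.shape
    D = rng.random((n, d))  # news latent features
    V = rng.random((t, d))  # word latent features
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never flip.
        D *= (X @ V) / (D @ (V.T @ V) + lam * D + eps)
        V *= (X.T @ D) / (V @ (D.T @ D) + lam * V + eps)
    return D, V

def nmf_loss(X, D, V, lam=0.1):
    """Regularized reconstruction loss from the objective above."""
    fit = np.linalg.norm(X - D @ V.T) ** 2
    reg = lam * (np.linalg.norm(D) ** 2 + np.linalg.norm(V) ** 2)
    return fit + reg

# Toy bag-of-words matrix: 6 news pieces, 10 vocabulary terms (made-up data).
X = np.random.default_rng(1).integers(0, 4, size=(6, 10)).astype(float)
D0, V0 = nmf_embed(X, d=3, n_iter=0)    # random initialization only
D, V = nmf_embed(X, d=3, n_iter=200)    # after 200 update sweeps
print(nmf_loss(X, D0, V0), nmf_loss(X, D, V))
```

Each row of `D` is the latent representation of one news piece; in TriFN these rows are shared with the other terms of the objective rather than learned in isolation.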
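User social embedding. A nonnegative tri-factorization of the user-user adjacency matrix captures social homophily (like-minded users tend to form relationships): \[\min_{U,T \geq 0} \|Y \odot (A - UTU^T)\|_F^2 + \lambda(\|U\|_F^2 + \|T\|_F^2)\] where \(U \in \mathbb{R}^{m \times d}\) is the latent feature matrix for \(m\) users, \(T \in \mathbb{R}^{d \times d}\) is a user-user correlation matrix, and \(Y\) weights the entries of \(A\).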
User-news interaction embedding. This term models the observation that high-credibility users preferentially share true news, while low-credibility users preferentially share fake news. User credibility \(c_i \in [0, 1]\) is inferred from clustering (less credible users tend to coordinate and form larger clusters). The interaction term is \[\min_{U,D_L \geq 0} \sum_{i,j} W_{ij} \left[ c_i\left(1 - \tfrac{1 + y_{Lj}}{2}\right) + (1-c_i)\left(\tfrac{1 + y_{Lj}}{2}\right) \right] \|U_i - D_{Lj}\|_2^2\] which weights the squared distance between user and news latent features: for true news (\(y_{Lj} = -1\)) the weight is \(c_i\), so high-credibility users are pulled toward true news; for fake news (\(y_{Lj} = 1\)) the weight is \(1 - c_i\), so low-credibility users are pulled toward fake news.
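To make the weighting concrete, the bracketed factor can be evaluated for a few user-news pairs. The sketch below uses plain Python lists; the helper names `interaction_weight` and `interaction_loss` and the toy credibility scores, labels, and latent vectors are illustrative assumptions, not values from the paper.

```python
def interaction_weight(c_i, y_j):
    """Weight on ||U_i - D_Lj||^2 for one observed share (W_ij = 1).

    y_j = +1 marks fake news and y_j = -1 marks true news, as in the
    problem setup.
    """
    is_fake = (1 + y_j) / 2          # 1.0 for fake, 0.0 for true
    return c_i * (1 - is_fake) + (1 - c_i) * is_fake

def interaction_loss(W, c, y, U, D_L):
    """Sum the credibility-weighted squared distances over observed shares."""
    total = 0.0
    for i, row in enumerate(W):
        for j, w_ij in enumerate(row):
            if w_ij:
                sq_dist = sum((u - v) ** 2 for u, v in zip(U[i], D_L[j]))
                total += w_ij * interaction_weight(c[i], y[j]) * sq_dist
    return total

# Toy example: user 0 is credible (0.9), user 1 is not (0.2);
# news 0 is true (-1), news 1 is fake (+1); every user shared every item.
c = [0.9, 0.2]
y = [-1, 1]
W = [[1, 1], [1, 1]]
U = [[0.0, 0.0], [1.0, 1.0]]
D_L = [[0.0, 1.0], [1.0, 0.0]]

print(interaction_weight(0.9, -1))  # credible user, true news: weight 0.9
print(interaction_weight(0.2, 1))   # low-credibility user, fake news: weight 0.8
print(interaction_loss(W, c, y, U, D_L))
```

All four squared distances equal 1 in this toy setup, so the loss is just the sum of the four weights (0.9 + 0.1 + 0.2 + 0.8), which shows that the mismatched pairs (credible user with fake news, low-credibility user with true news) contribute little to the pull.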
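Publisher-news relation embedding. Publishers with strong partisan bias are more likely to publish fake news. The latent features are regularized so that each publisher's representation (the average of the latent features of its published news) predicts its partisan bias: \[\min_{D \geq 0, q} \|e \odot (\bar{B}Dq - o)\|_2^2 + \lambda\|q\|_2^2\] where \(\bar{B}\) is the row-normalized publisher-news matrix, so \(\bar{B}D\) averages the latent features of each publisher's news, and \(q\) maps that average to a partisan bias score.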
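A small numerical sketch of this term follows. It assumes that `e` is a 0/1 vector marking which publishers have a bias label (the note does not define `e`, so this reading is an assumption), and the function name `publisher_bias_loss` and all toy values are hypothetical.

```python
import numpy as np

def publisher_bias_loss(B, D, q, o, e, lam=0.1):
    """Masked squared error between predicted and labeled partisan bias.

    B: l x n publisher-news matrix, D: n x d news latent features,
    q: length-d weight vector, o: length-l bias labels in {-1, 0, 1},
    e: length-l 0/1 mask (assumed: 1 where a bias label is available).
    """
    row_sums = B.sum(axis=1, keepdims=True)
    B_bar = B / np.maximum(row_sums, 1)   # row-normalize; guard empty rows
    pred = B_bar @ D @ q                  # bias score per publisher
    resid = e * (pred - o)                # unlabeled publishers contribute 0
    return float(resid @ resid + lam * (q @ q))

# Toy data: 2 publishers, 3 news pieces, 2 latent dimensions.
B = np.array([[1, 1, 0],
              [0, 0, 1]])
D = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
q = np.array([1.0, -1.0])
o = np.array([0.0, 1.0])

print(publisher_bias_loss(B, D, q, o, e=np.array([1.0, 1.0])))
print(publisher_bias_loss(B, D, q, o, e=np.array([1.0, 0.0])))
```

In this toy case both publishers get a predicted score of 0, so only the second publisher (labeled 1) incurs error; masking it out with `e` leaves just the regularizer on `q`.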
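Unified objective (TriFN). All components are combined into one joint optimization problem with hyperparameters \(\alpha, \beta, \gamma, \eta\): \[\min \|X - DV^T\|_F^2 + \alpha \|Y \odot (A - UTU^T)\|_F^2 + \beta\, \text{tr}(H^T LH) + \gamma \|e \odot (\bar{B}Dq - o)\|_2^2 + \eta \|D_L p - y_L\|_2^2 + \lambda R\] Here the trace term is the user-news interaction loss rewritten in matrix form, \(\|D_L p - y_L\|_2^2\) is the semi-supervised linear classifier fitted on the labeled news, and \(R\) collects the norm regularizers on the learned variables.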
The objective is solved via alternating least squares (ALS), with update rules derived from the KKT conditions; the authors state that convergence is guaranteed.
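The trace term in the unified objective can be checked numerically against a pairwise weighted-distance sum. The construction below is an assumption about how the term is built, since this note does not define \(H\) or \(L\): `H` stacks the user and labeled-news latent features, and `L` is the graph Laplacian of the bipartite user-news weight graph. The matrix `G` stands in for the combined per-pair weights (interaction indicator times credibility factor) and is filled with random values purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, d = 4, 3, 2                 # users, labeled news, latent dimensions
U = rng.random((m, d))            # user latent features
D_L = rng.random((r, d))          # labeled-news latent features
G = rng.random((m, r))            # assumed combined weight per (user, news)

# Direct form: weighted sum of squared distances over all pairs.
direct = sum(
    G[i, j] * np.sum((U[i] - D_L[j]) ** 2)
    for i in range(m)
    for j in range(r)
)

# Trace form: stack the features, build the bipartite weight matrix F,
# and take L = S - F with S the diagonal matrix of row sums of F.
H = np.vstack([U, D_L])
F = np.block([
    [np.zeros((m, m)), G],
    [G.T, np.zeros((r, r))],
])
L = np.diag(F.sum(axis=1)) - F
trace_form = np.trace(H.T @ L @ H)

print(direct, trace_form)
```

The two values agree up to floating-point error, which is why the pairwise interaction loss can be written compactly as a trace and differentiated with respect to `U` and `D_L` in matrix form.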
Results
Datasets. FakeNewsNet (Shu et al. 2018): BuzzFeed (182 news, 91 fake, 15K users, 25K engagements) and PolitiFact (240 news, 120 fake, 24K users, 37K engagements).
Baselines. Content-only (RST, LIWC), social-only (Castillo), and hybrid (RST+Castillo, LIWC+Castillo). Each tested across 7 classifiers (LogReg, Naïve Bayes, DTree, RForest, XGBoost, AdaBoost, GradBoost); best reported.
Main results (Table 2):
| Method | BuzzFeed F1 | PolitiFact F1 |
|---|---|---|
| RST | 0.633 | 0.615 |
| LIWC | 0.709 | 0.666 |
| Castillo | 0.797 | 0.822 |
| RST+Castillo | 0.805 | 0.835 |
| LIWC+Castillo | 0.822 | 0.843 |
| TriFN | 0.870 | 0.880 |
Relative to the strongest baseline (LIWC+Castillo), TriFN improves by 4.72% (BuzzFeed accuracy), 5.84% (BuzzFeed F1), 5.91% (PolitiFact accuracy), and 4.39% (PolitiFact F1).
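The F1 percentages are relative, not absolute, gains, and they can be reproduced from the table. The short check below uses only the F1 values listed above (the accuracy values are not included in this note, so they are not recomputed); the helper name `relative_improvement` is introduced here for illustration.

```python
def relative_improvement(new, base):
    """Percentage gain of `new` over `base`, relative to `base`."""
    return 100 * (new - base) / base

# F1 of TriFN vs. LIWC+Castillo, taken from the results table.
buzzfeed = relative_improvement(0.870, 0.822)
politifact = relative_improvement(0.880, 0.843)

print(f"BuzzFeed F1: {buzzfeed:.2f}%")      # BuzzFeed F1: 5.84%
print(f"PolitiFact F1: {politifact:.2f}%")  # PolitiFact F1: 4.39%
```

In absolute terms the gaps are 4.8 and 3.7 F1 points, respectively.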
Component ablation (Figure 3). Removing the publisher partisan term (\(\gamma=0\)) drops F1 by ~3–4%. Removing the user social engagement terms (\(\alpha = \beta = 0\)) drops F1 by ~5–7%. Removing both degrades performance further, confirming that the two signals are complementary.
Early detection (Figure 4). At 12 hours post-publication: TriFN F1 ~0.60–0.65. At 48 hours: F1 >0.80. Consistent advantage over baselines even with sparse early data.
Parameter analysis (Figure 5). Sensitivity to \(\eta\) (classification weight) and \(\gamma\) (publisher bias weight): performance plateaus over a broad range (\(\eta \in [1, 50]\), \(\gamma \in [1, 50]\)) and is stable for the social term coefficients \(\alpha, \beta \in [10^{-5}, 10^{-3}]\).
Connections
- FakeNewsNet dataset: the benchmark used; also see Shu et al. (2018) dataset paper.
- User profiles for fake news: related work; see Shu et al. (2019) UPF for a complementary user-feature approach.
- Publisher partisan bias: core signal; relates to journalism research on partisan distortion.
- User credibility: inferred via clustering; see Castillo et al. (2011) for complementary stance-based credibility.
- Social context for detection: exemplar of tri-partite network analysis for misinformation.
- Semi-supervised learning: labels only a subset of news; leverages unlabeled signal.
- Shu et al. (2019), Hierarchical RvNN: contemporaneous work from the same group using RNNs on propagation trees.
- Shu et al. (2019) — DEFEND: extends to explainable detection.
Notes
Strength. The tri-relationship framing is intuitive and empirically well-motivated. Partisan bias as a structural signal is novel compared to prior content/user-based work. Early detection (48 hrs) is practically important. Joint learning captures complementarity that concatenated features do not.
Limitations.
- User credibility is inferred from clustering patterns, assuming malicious users coordinate. This may not hold in all domains (e.g., organic misinformation).
- Partisan bias labels are sourced from Media Bias Fact Check (MBFC), a subjective source; only "left," "least-biased," and "right" labels are used, excluding nuance. Generalization to non-US-political or non-partisan misinformation is unclear.
- Evaluation is limited to FakeNewsNet's fact-checked news (BuzzFeed, PolitiFact). Real-world fake news evolves; static datasets may not capture temporal drift.
- Comparison with propagation-based methods (e.g., rumor diffusion networks) is explicitly excluded; this choice limits the scope of conclusions about social signals.
- Hyperparameters are tuned on both datasets; there is some overfitting risk given the dataset sizes (91–120 fake per partition).
Significance. TriFN helped establish social context as a first-class signal in fake news detection, influencing subsequent work on graph-based and heterogeneous network approaches. The tri-relationship concept is reused in later papers (e.g., hierarchical propagation trees, explainable systems).