dEFEND: Explainable Fake News Detection
Authors: Kai Shu, Limeng Cui, Suhang Wang, Dongwon Lee, Huan Liu
Venue: Proceedings of the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '19), August 4–8, 2019, Anchorage, AK
TL;DR
Fake news detection on social media typically focuses on content alone, but user comments and reactions provide complementary signals. This paper proposes dEFEND, a hierarchical attention neural network that jointly encodes news sentences and user comments to both detect fake news and explain why a piece of news is detected as fake by identifying key sentences and comments. On PolitiFact and GossipCop, dEFEND achieves accuracy 0.904 and 0.808, respectively, outperforming content-only and social-context-only baselines.
Contributions
- Introduces the novel problem of explainable fake news detection: not just classifying news as fake/real, but identifying which sentences and comments drive the prediction.
- Proposes dEFEND, a four-component neural architecture: (1) news content encoder (word-level + sentence-level with hierarchical attention), (2) user comment encoder, (3) sentence-comment co-attention module, and (4) fake news prediction component.
- Demonstrates through ablations and human evaluation that news content features and user comment features capture complementary information; jointly modeling them improves performance significantly.
- Conducts extensive experiments on FakeNewsNet (PolitiFact and GossipCop) and includes human evaluation of explainability via Amazon Mechanical Turk.
Method
News Content Encoding. Each news article is first segmented into sentences, and each sentence into words. Words are mapped to embeddings, and a bidirectional GRU (Gated Recurrent Unit) encodes each word sequence, concatenating the forward and backward hidden states. Attention weights over words are computed via learned dot-product attention, producing a sentence-level representation. Sentence representations are then encoded by a second bidirectional GRU with a similar attention mechanism, producing document-level representations that highlight important sentences.
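The attention pooling step described above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes the form commonly used in hierarchical attention networks (a tanh projection scored by a dot product with a learned context vector), and the names `attention_pool`, `W`, `b`, and `context` are hypothetical. The toy `hidden` vectors stand in for the concatenated forward/backward GRU states.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden, W, b, context):
    """Pool a sequence of hidden states into a single vector.

    hidden:  list of T vectors (each of length d), e.g. BiGRU outputs
    W, b:    projection weights (k x d) and bias (length k), hypothetical names
    context: learned context vector (length k), hypothetical name
    Returns (weights, pooled); the weights sum to 1.
    """
    scores = []
    for h in hidden:
        # u = tanh(W h + b)
        u = [math.tanh(sum(w * x for w, x in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        # Attention score is the dot product of u with the context vector.
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, context)))
    weights = softmax(scores)
    d = len(hidden[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, hidden)) for j in range(d)]
    return weights, pooled

# Toy example: three word states of dimension 2 with an identity projection.
hidden = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
context = [1.0, 1.0]
weights, sentence_vec = attention_pool(hidden, W, b, context)
```

In this toy input the second word state has the largest projection onto the context vector, so it receives the largest weight. The same pooling could be reused one level up, over sentence states instead of word states.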
User Comments Encoding. User comments posted on the news are encoded analogously: bidirectional GRU + attention at word level, then comment-level encoding.
Sentence-Comment Co-attention. Not all sentences in news contents are equally important for fake news detection. Similarly, comments are valued at different levels based on how well they relate to the news content. A co-attention mechanism jointly models the mutual influence between sentences and comments: attention weights for sentences are computed conditioned on all comments, and vice versa. This captures which sentences are discussed/questioned in comments and which comments address uncertain or controversial sentences.
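The mutual conditioning described above can be sketched with an affinity-matrix style of co-attention. This is an illustrative sketch under stated assumptions, not the paper's code: it assumes an affinity score `tanh(c^T W_l s)` between every comment and sentence, and all parameter names (`W_l`, `W_s`, `W_c`, `w_hs`, `w_hc`) as well as the function `co_attention` are hypothetical.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    return [sum(w * vec[j] for w, vec in zip(weights, vectors))
            for j in range(len(vectors[0]))]

def co_attention(S, C, W_l, W_s, W_c, w_hs, w_hc):
    """S: N sentence vectors, C: T comment vectors (all of length d).

    W_l is d x d; W_s and W_c are k x d; w_hs and w_hc have length k.
    All parameter names are assumptions made for this sketch.
    """
    # Affinity between comment j and sentence i: F[j][i] = tanh(c_j^T W_l s_i)
    F = [[math.tanh(dot(c, matvec(W_l, s))) for s in S] for c in C]
    PS = [matvec(W_s, s) for s in S]  # projected sentences
    PC = [matvec(W_c, c) for c in C]  # projected comments
    k = len(W_s)
    # Sentence features, each conditioned on all comments via F.
    H_s = [[math.tanh(PS[i][r] + sum(F[j][i] * PC[j][r] for j in range(len(C))))
            for r in range(k)] for i in range(len(S))]
    # Comment features, each conditioned on all sentences via F.
    H_c = [[math.tanh(PC[j][r] + sum(F[j][i] * PS[i][r] for i in range(len(S))))
            for r in range(k)] for j in range(len(C))]
    a_s = softmax([dot(w_hs, h) for h in H_s])  # attention over sentences
    a_c = softmax([dot(w_hc, h) for h in H_c])  # attention over comments
    return a_s, a_c, weighted_sum(a_s, S), weighted_sum(a_c, C)

# Toy example: 3 sentences, 2 comments, identity projections.
S = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
C = [[1.0, 0.2], [0.9, 0.1]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
a_s, a_c, s_hat, c_hat = co_attention(S, C, I2, I2, I2, [1.0, 1.0], [1.0, 1.0])
```

The returned weights `a_s` and `a_c` are what an explanation would rank: sorting sentences by `a_s` and comments by `a_c` gives the top-k candidates. In the toy input, the first sentence points in nearly the same direction as both comments and therefore outweighs the second, which does not.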
Fake News Prediction. Learned representations from news sentences and user comments are concatenated; a softmax layer produces the final fake/real classification. The loss function is cross-entropy.
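A minimal sketch of this prediction step, assuming a single linear layer before the softmax. The names `predict`, `W_f`, and `b_f`, the toy weights, and the convention that class index 1 means "fake" are all assumptions made for illustration.

```python
import math

def predict(s_hat, c_hat, W_f, b_f):
    """Concatenate the two representations and apply a softmax classifier."""
    z = s_hat + c_hat  # list concatenation: [s_hat ; c_hat]
    logits = [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(W_f, b_f)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy loss for one example with an integer class label."""
    return -math.log(probs[label])

# Toy inputs: 2-d sentence and comment representations, 2 output classes.
s_hat = [0.4, 0.4]
c_hat = [0.95, 0.15]
W_f = [[0.5, -0.2, 0.1, 0.3],    # weights for class 0 (assumed: real)
       [-0.3, 0.4, 0.6, -0.1]]   # weights for class 1 (assumed: fake)
b_f = [0.0, 0.0]

probs = predict(s_hat, c_hat, W_f, b_f)
loss_if_fake = cross_entropy(probs, 1)
loss_if_real = cross_entropy(probs, 0)
```

The loss is smaller for whichever label the classifier assigns more probability to, which is what gradient-based training exploits.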
The model is trained end-to-end via backpropagation with RMSprop optimization.
Results
Classification Performance (Table 2):
| Method | PolitiFact Accuracy | GossipCop Accuracy |
|---|---|---|
| RST (Rhetorical Structure Theory) | 0.607 | 0.551 |
| LIWC | 0.769 | 0.736 |
| text-CNN | 0.653 | 0.739 |
| HAN (Hierarchical Attention Network) | 0.837 | 0.742 |
| TCNN-URG | 0.712 | 0.715 |
| HPA-BLSTM | 0.846 | 0.753 |
| CSI (hybrid model) | 0.827 | 0.772 |
| dEFEND | 0.904 | 0.808 |
dEFEND outperforms all baselines on both datasets. On PolitiFact, it improves accuracy by 5.8 percentage points over HPA-BLSTM (0.846 → 0.904) and by 7.7 points over CSI (0.827 → 0.904); on GossipCop, by 5.5 points over HPA-BLSTM (0.753 → 0.808) and by 3.6 points over CSI (0.772 → 0.808).
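The gaps can be recomputed directly from the accuracies in the table. The short script below is only a checking aid; the helper name `gap_in_points` is hypothetical, and the numbers are copied from the table as summarized here.

```python
# Accuracy values copied from the table above: (PolitiFact, GossipCop).
accuracy = {
    "HPA-BLSTM": (0.846, 0.753),
    "CSI": (0.827, 0.772),
    "dEFEND": (0.904, 0.808),
}
datasets = ("PolitiFact", "GossipCop")

def gap_in_points(model, baseline, col):
    """Absolute accuracy gap in percentage points, rounded to one decimal."""
    return round((accuracy[model][col] - accuracy[baseline][col]) * 100, 1)

gaps = {
    (baseline, name): gap_in_points("dEFEND", baseline, col)
    for baseline in ("HPA-BLSTM", "CSI")
    for col, name in enumerate(datasets)
}

for (baseline, name), points in gaps.items():
    print(f"dEFEND vs {baseline} on {name}: +{points} points")
```

These are absolute differences in accuracy (percentage points), not relative improvements.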
Ablation Studies: Removing sentence-comment co-attention (dEFEND_{NC}) degrades F1 by 4.25% on PolitiFact and 18.25% on GossipCop, showing the critical importance of jointly modeling news and comments. Removing user comments entirely (dEFEND_{N}) reduces accuracy even further, confirming that comment features are complementary to content features.
Explainability Evaluation (Human Study via Amazon Mechanical Turk):
- Top-k explainable sentences: Compared against HPA-BLSTM, dEFEND ranks check-worthy sentences higher. On the MAP@5 and MAP@10 metrics, dEFEND achieves MAP@5 ≈ 0.85–0.90 vs. ≈ 0.65–0.75 for HPA-BLSTM, depending on the dataset.
- Top-k explainable comments: dEFEND identifies comments that workers rated as explaining why the news is fake. In worker-level evaluations (5 workers per article), the comments selected by dEFEND are rated slightly more explainable than those selected by HPA-BLSTM (Worker Ratio, WR, of 0.65 vs. 0.64).
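For readers unfamiliar with MAP@k, the sketch below shows one common way to compute it for ranked explanation candidates. It is an assumption-laden illustration: definitions of average precision at k vary (here the sum is normalized by `min(k, number of relevant items)`), the paper's exact variant and ground-truth labels are not reproduced, and the item IDs and function names are made up.

```python
def average_precision_at_k(ranked, relevant, k):
    """AP@k for one ranked list, normalized by min(k, number of relevant items)."""
    hits = 0
    score = 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / position  # precision at this cut-off
    denom = min(k, len(relevant))
    return score / denom if denom else 0.0

def map_at_k(rankings, relevants, k):
    """Mean of AP@k over several articles."""
    aps = [average_precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)

# Hypothetical model rankings of sentences for two articles.
rankings = [["s3", "s1", "s7", "s2", "s5"], ["a", "b", "c"]]
# Hypothetical sets of sentences judged check-worthy for each article.
relevants = [{"s3", "s7"}, {"b"}]

ap_first = average_precision_at_k(rankings[0], relevants[0], 5)  # (1/1 + 2/3) / 2
ap_second = average_precision_at_k(rankings[1], relevants[1], 5)  # (1/2) / 1
score = map_at_k(rankings, relevants, 5)
```

A higher MAP@k means that relevant sentences tend to appear nearer the top of the model's ranking.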
Connections
- Directly builds on Shu et al. (2019) — User Profiles, a complementary work from the same authors that uses social-context features rather than content/comments.
- Contributes to the explainable detection literature by showing that hierarchical attention mechanisms can surface interpretable sentence-level and comment-level explanations without post-hoc analysis.
- Related to hierarchical attention networks and their application to document classification.
- Uses the FakeNewsNet dataset, the same benchmark as Zhou et al. (2020) — SAFE.
- Demonstrates the value of user comments as a detection signal, complementary to content features.
Notes
The co-attention mechanism is a key design choice: removing it causes a large drop in the ablations, particularly on GossipCop (about 18% F1 loss). This suggests that user comment dynamics in that dataset are rich and informative for detecting fake news.
The explainability evaluation is limited to top-k sentence and comment ranking; the paper does not provide confidence scores or probability distributions over explanations. The human evaluation procedure (worker voting on explainability scores 0–4) is somewhat coarse, and inter-rater agreement (e.g., Fleiss' kappa) is not reported.
The method assumes that comments exist for all news items; the paper does not discuss robustness to sparse or missing comments on other social media platforms, or in early-detection scenarios where comments have not yet arrived.