KAN: Knowledge-aware Attention Network for Fake News Detection
Authors: Yaqian Dun, Kefei Tu, Chen Chen, Chunyan Hou, Xiaojie Yuan
Affiliations: College of Computer Science, Nankai University; Tianjin Key Laboratory of Network and Data Security Technology; School of Computer Science and Engineering, Tianjin University of Technology
Venue: The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021
TL;DR
Most fake news detection models rely on textual features and social context but ignore knowledge-level relationships among the entities mentioned in news. This paper proposes KAN, which extracts entity mentions from news, aligns them with a knowledge graph (Wikidata), and uses two attention mechanisms (N-E and N-E2C) to weight entities and entity contexts by importance. Measured against the KCNN baseline, KAN improves F1 by about 7.4 points on PolitiFact, 2.8 on GossipCop, and 9.7 on PHEME.
Contributions
- Incorporates entities and their entity contexts (neighbors in knowledge graphs) as external knowledge for fake news detection—addressing the limitation that existing methods ignore knowledge-level entity relationships.
- Proposes the Knowledge-aware Attention Network (KAN) with two attention mechanisms:
  - N-E attention: Measures semantic similarity between news content and entities to assign importance weights to entities.
  - N-E2C attention: Assigns importance weights to entity contexts according to the news content and their corresponding entities.
- Demonstrates through ablation studies that both knowledge components and attention mechanisms are critical to detection performance.
Method
Knowledge Extraction: Uses entity linking (TagMe tool) to identify entity mentions in news and align them with Wikidata. For each linked entity, extracts its immediate neighbors (one-hop distance) as entity context.
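The context-building step can be illustrated with a minimal sketch. It assumes entity linking has already happened and works on a toy triple list rather than on Wikidata itself; the identifiers (`Q_obama` and so on), the 2-d embeddings, and the helper names `one_hop_neighbors` and `context_embedding` are all hypothetical. Treating edges as undirected is also an assumption. The averaging of neighbor embeddings follows the description of the entity context representation given in the notes on this paper.

```python
from collections import defaultdict

# Toy (head, relation, tail) triples standing in for a knowledge graph.
# All identifiers here are made up for illustration.
TRIPLES = [
    ("Q_obama", "position_held", "Q_president"),
    ("Q_obama", "member_of", "Q_democratic_party"),
    ("Q_biden", "member_of", "Q_democratic_party"),
]

# Toy 2-d entity embeddings (hypothetical values).
EMBEDDINGS = {
    "Q_obama": [1.0, 0.0],
    "Q_president": [0.0, 2.0],
    "Q_democratic_party": [2.0, 2.0],
    "Q_biden": [1.0, 1.0],
}


def one_hop_neighbors(triples):
    """Map each entity to the set of entities exactly one edge away.

    Assumption: edges are treated as undirected.
    """
    neighbors = defaultdict(set)
    for head, _relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return neighbors


def context_embedding(entity, neighbors, embeddings):
    """Average the embeddings of an entity's one-hop neighbors."""
    dim = len(next(iter(embeddings.values())))
    vectors = [embeddings[n] for n in sorted(neighbors.get(entity, set()))]
    if not vectors:
        # An entity with no neighbors gets a zero vector in this sketch.
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


neighbors = one_hop_neighbors(TRIPLES)
obama_context = context_embedding("Q_obama", neighbors, EMBEDDINGS)
print(obama_context)  # [1.0, 2.0]
```

An entity that is sparsely connected ends up with a context built from very few vectors, which is one way to see why knowledge graph coverage matters for this design.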
Architecture:
- Text Encoder: Transformer encoder with positional encoding to generate news representation p.
- Knowledge Encoder: Separate transformer encoders for entity embeddings (from word2vec) and entity context embeddings to produce intermediate encodings q' and r'.
- Attention Mechanisms:
  - N-E attention computes attention weights α between news and entities to produce weighted entity representation q.
  - N-E2C attention uses news and entity representations to weight entity contexts and produce representation r.
- Classifier: Concatenates p, q, and r, feeds into a fully-connected layer with softmax and L2 regularization.
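To make the data flow through the two attention mechanisms and the classifier concrete, here is a small sketch in plain Python. It is one plausible reading of the description above, not the authors' implementation: it assumes single-head scaled dot-product attention with no learned projections, toy 2-d vectors, and hypothetical helper names (`attend`, `classify`). In N-E attention the news vector p queries the entity encodings q'; in N-E2C attention the same news-to-entity scores are used to pool the entity context encodings r'.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    Returns the weighted sum of `values` and the attention weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return pooled, weights


def classify(p, q, r, weight_rows, bias):
    """Concatenate p, q, r and apply one fully-connected layer plus softmax."""
    z = p + q + r  # list concatenation
    logits = [dot(row, z) + b for row, b in zip(weight_rows, bias)]
    return softmax(logits)


# Toy inputs (hypothetical values): one news vector, two entities.
p = [1.0, 0.0]                        # news representation
q_prime = [[1.0, 0.0], [0.0, 1.0]]    # entity encodings q'
r_prime = [[0.5, 0.5], [2.0, 0.0]]    # entity context encodings r'

# N-E attention: news queries entities, entities are also the values.
q, alpha = attend(p, q_prime, q_prime)

# N-E2C attention: news queries entities, entity contexts are the values.
# Without learned projections the weights equal alpha; a trained model
# would use separate parameters for each mechanism.
r, beta = attend(p, q_prime, r_prime)

# Untrained, hand-picked classifier parameters for a 2-class output.
W = [[0.1] * 6, [-0.1] * 6]
b = [0.0, 0.0]
probs = classify(p, q, r, W, b)
print(alpha, probs)
```

Because each entity context shares its weight with the entity it belongs to, a context is only as influential as the news-to-entity match that selects it.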
Results
Experiments on three benchmark datasets:
| Dataset | Metric | KAN | KCNN | B-TransE | GRU-2 |
|---|---|---|---|---|---|
| PolitiFact | F1 | 0.8539 | 0.7804 | 0.7641 | 0.7041 |
| PolitiFact | Accuracy | 0.8586 | 0.7827 | 0.7694 | 0.7109 |
| GossipCop | F1 | 0.7713 | 0.7433 | 0.7340 | 0.7079 |
| GossipCop | Accuracy | 0.7766 | 0.7491 | 0.7394 | 0.7180 |
| PHEME | F1 | 0.7461 | 0.6489 | 0.6074 | 0.6917 |
| PHEME | Accuracy | 0.7830 | 0.7265 | 0.7200 | 0.7371 |
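The F1 gaps implied by the table can be checked with a few lines of arithmetic. The sketch below copies the F1 rows from the table and reports KAN's lead over each baseline in absolute percentage points; `Decimal` is used only to avoid floating-point rounding noise, and the function name `f1_gaps` is made up for this example.

```python
from decimal import Decimal

# F1 scores copied from the results table above.
F1 = {
    "PolitiFact": {"KAN": "0.8539", "KCNN": "0.7804", "B-TransE": "0.7641", "GRU-2": "0.7041"},
    "GossipCop": {"KAN": "0.7713", "KCNN": "0.7433", "B-TransE": "0.7340", "GRU-2": "0.7079"},
    "PHEME": {"KAN": "0.7461", "KCNN": "0.6489", "B-TransE": "0.6074", "GRU-2": "0.6917"},
}


def f1_gaps(table):
    """Return KAN's F1 lead over each baseline, in percentage points."""
    gaps = {}
    for dataset, scores in table.items():
        kan = Decimal(scores["KAN"])
        gaps[dataset] = {
            model: (kan - Decimal(score)) * 100
            for model, score in scores.items()
            if model != "KAN"
        }
    return gaps


gaps = f1_gaps(F1)
for dataset, row in gaps.items():
    strongest = min(row, key=row.get)  # smallest gap = strongest baseline
    print(f"{dataset}: +{row['KCNN']:.2f} over KCNN, "
          f"+{row[strongest]:.2f} over {strongest} (strongest listed baseline)")
```

Among the baselines listed here, KCNN is the strongest on PolitiFact and GossipCop, while GRU-2 is the strongest on PHEME, so the gap over KCNN overstates KAN's margin over the best listed baseline on that dataset.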
Ablations reveal:
- Removing entity contexts (KAN\EC) degrades performance, confirming their value.
- Removing entities entirely (KAN\E) shows entities are crucial for disambiguation.
- Removing all external knowledge (KAN\EC\E) reduces F1 by 2.2% on PolitiFact, 1.2% on GossipCop, and 1.3% on PHEME.
- N-E and N-E2C attention mechanisms improve performance by 2.2% accuracy on PolitiFact and 6.2% on GossipCop when used together.
Connections
- Related to Knowledge graphs for entity linking and context extraction.
- Builds on Content-based fake news detection by enriching news representations with external knowledge.
- Uses attention mechanisms similar to those in Multimodal fake news detection for feature fusion.
- Contrasts with Propagation-based fake news detection which focuses on social context rather than knowledge graphs.
- Comparable to EANN (Wang et al. 2018) in using attention for multi-modal fusion, but operates on knowledge rather than image-text pairs.
- Extends the knowledge graph application in DEAP-FAKED by incorporating both entities and their contexts with attention weighting.
Notes
Strengths:
- Novel and well-motivated use of entity contexts from knowledge graphs: entities rarely appear in isolation, and their neighbors provide disambiguating context.
- Thorough ablation studies demonstrate each component contributes meaningfully.
- Strong empirical results across three diverse datasets (politics, entertainment, Twitter events).
- Clear architectural design with interpretable attention weights.
Limitations:
- Entity linking quality depends on the TagMe tool; errors propagate downstream. The paper does not report linking accuracy or analyze failure modes.
- Knowledge graph coverage bias is not discussed: rare entities may have sparse contexts or be absent from Wikidata.
- Comparison to other knowledge-aware methods (e.g., B-TransE) is included, but there is limited discussion of why KAN's attention design outperforms them.
- The entity context representation uses simple averaging of neighbor embeddings; richer encoding of multi-hop paths or relation types might improve performance.
- Evaluation is limited to supervised settings; generalization to out-of-domain news without retraining is unclear.