FakeNewsNet
FakeNewsNet is a benchmark data repository for fake news research, combining news content with rich social context. News labels are annotated by professional fact-checkers from two platforms:
- PolitiFact: US political news; 361 fake + 361 real news items with ~159k users and ~271k sharing events (before bot filtering).
- GossipCop: Entertainment/celebrity news; 4,513 fake + 4,513 real news items with ~210k users and ~812k sharing events.
Each news item includes the article text and its social context (user posting and sharing events on Twitter), which enables both content-based and social-context-based detection experiments. The dataset is dynamic: social context accumulates over time, so user and event counts depend on when the data was collected.
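To make the pairing of news content and social context concrete, here is a minimal sketch of how one news item might be represented in memory. The class and field names (`NewsItem`, `ShareEvent`, `shares_before`) are hypothetical illustrations and do not mirror the repository's actual file layout or schema.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ShareEvent:
    """One posting or sharing event (hypothetical schema)."""
    user_id: str
    timestamp: int  # Unix seconds; assumed unit for this sketch


@dataclass
class NewsItem:
    """A news article plus the social context collected for it so far."""
    news_id: str
    source: str  # "politifact" or "gossipcop"
    label: str   # "fake" or "real", as assigned by the fact-checkers
    text: str
    shares: List[ShareEvent] = field(default_factory=list)

    def add_share(self, user_id: str, timestamp: int) -> None:
        # Social context is dynamic: events keep arriving after publication.
        self.shares.append(ShareEvent(user_id, timestamp))

    def shares_before(self, cutoff: int) -> List[ShareEvent]:
        # Snapshot of the social context as it looked at `cutoff`.
        return [s for s in self.shares if s.timestamp <= cutoff]

    def unique_users(self) -> Set[str]:
        # Distinct users who shared this item at least once.
        return {s.user_id for s in self.shares}


item = NewsItem("n1", "politifact", "fake", "Example article text.")
item.add_share("u1", 100)
item.add_share("u2", 250)
item.add_share("u1", 400)
early_context = item.shares_before(300)
```

A content-based experiment would read only `text` and `label`, while a social-context experiment would also use `shares`. Restricting to `shares_before(cutoff)` is one way to keep experiments comparable across collection dates.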
Papers in this wiki that use FakeNewsNet
- Shu et al. (2019) — The Role of User Profiles for Fake News Detection
- Zhou et al. (2020) — SAFE: Similarity-Aware Multi-Modal Fake News Detection
- Zhou et al. (2020) — Fake News Early Detection: An Interdisciplinary Study (uses an older version of the dataset: PolitiFact with 240 articles, BuzzFeed with 180 articles)
- A Heuristic-driven Uncertainty based Ensemble Framework for Fake News Detection in Tweets and News Articles — An ensemble of pre-trained language models evaluated on the FakeNewsNet dataset; it reports F1 = 0.9156 by combining soft voting, a Statistical Feature Fusion Network, and heuristic post-processing
Notes
Bot accounts constitute 14–21% of users, depending on the partition; Botometer is commonly applied to filter them out before profile-based analysis. The dataset has a class-balanced design (equal fake and real news counts per platform), which simplifies evaluation but may not reflect real-world class distributions.
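The bot-filtering step mentioned above can be sketched as follows. This is only an illustration, assuming per-user bot scores in the range 0 to 1 have already been obtained (for example from Botometer) and stored in a dictionary. It does not call the Botometer API, and the function names, the 0.5 threshold, and the choice to drop unscored users are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# A sharing event as a (user_id, news_id) pair; hypothetical representation.
Event = Tuple[str, str]


def bot_fraction(bot_scores: Dict[str, float], threshold: float = 0.5) -> float:
    """Fraction of scored users whose score is at or above the threshold."""
    if not bot_scores:
        return 0.0
    flagged = sum(1 for score in bot_scores.values() if score >= threshold)
    return flagged / len(bot_scores)


def filter_bots(
    events: List[Event],
    bot_scores: Dict[str, float],
    threshold: float = 0.5,  # assumed cutoff, not a value taken from the dataset
) -> List[Event]:
    """Keep only events from users scored strictly below the threshold.

    Users without a score are dropped as well, which is a conservative
    assumption made for this sketch.
    """
    kept = []
    for user_id, news_id in events:
        score = bot_scores.get(user_id)
        if score is not None and score < threshold:
            kept.append((user_id, news_id))
    return kept


events = [("u1", "n1"), ("u2", "n1"), ("u3", "n2"), ("u1", "n2"), ("u4", "n2")]
scores = {"u1": 0.1, "u2": 0.9, "u3": 0.4}  # "u4" has no score
clean_events = filter_bots(events, scores)
```

Running `bot_fraction` on each partition before filtering is a quick way to check that the share of flagged users falls in a plausible range for the chosen threshold.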