Rumor detection on social media
Rumor detection is the task of automatically classifying claims that emerge on social media as true, false, unverified, or non-rumor. It complements fake news detection, which typically focuses on articles from news outlets, by targeting the shorter, conversation-driven claims on Twitter, Weibo, Reddit, and other platforms where false information spreads rapidly.
Key signals for rumor detection
The rumor detection literature emphasizes several complementary signals:
- Propagation structure: How a claim spreads through retweets and replies carries information about its veracity. Ma et al. (2018) show that recursive patterns in thread responses (e.g., supportive replies to denials, questioning replies to affirmations) help reveal the underlying truth value.
- Temporal dynamics: Early retweet velocity and acceleration patterns differ between true and false rumors.
- User profiles: The reputation, follower count, and verification status of early spreaders matter; false rumors often originate from less-established accounts.
- Stance signals: User comments and replies express explicit stances (support, deny, query, comment) toward the rumor.
- Content features: Linguistic features, sentiment, and emotional language in the rumor itself.
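As a rough illustration of how the structural, temporal, and stance signals above might be turned into features, the sketch below computes tree depth, maximum breadth, an early reply rate, and stance counts for one conversation thread. The record layout (`id`, `parent`, `time`, `stance`), the function name `thread_features`, and the one-hour window are assumptions made for this example; they do not come from any of the cited papers or datasets.

```python
from collections import defaultdict

def thread_features(posts, early_window_s=3600):
    """Compute simple signals for one thread.

    posts: list of dicts with hypothetical keys
      id, parent (None for the source post),
      time (seconds since the source post), stance (label or None).
    """
    children = defaultdict(list)
    root = None
    for p in posts:
        if p["parent"] is None:
            root = p["id"]
        else:
            children[p["parent"]].append(p["id"])

    # Level-by-level walk from the source post: depth and widest level.
    depth, max_breadth = 0, 1
    level = [root]
    while level:
        max_breadth = max(max_breadth, len(level))
        next_level = [c for node in level for c in children[node]]
        if next_level:
            depth += 1
        level = next_level

    replies = [p for p in posts if p["parent"] is not None]

    # Temporal signal: replies per hour inside the early window.
    early = sum(1 for p in replies if p["time"] <= early_window_s)

    # Stance signal: how many replies carry each stance label.
    stance_counts = defaultdict(int)
    for p in replies:
        stance_counts[p["stance"]] += 1

    return {
        "depth": depth,
        "max_breadth": max_breadth,
        "early_rate_per_hour": early / (early_window_s / 3600),
        "stance_counts": dict(stance_counts),
    }

# Toy thread: source post 0, two direct replies, one nested reply.
example_posts = [
    {"id": 0, "parent": None, "time": 0, "stance": None},
    {"id": 1, "parent": 0, "time": 600, "stance": "deny"},
    {"id": 2, "parent": 0, "time": 1200, "stance": "support"},
    {"id": 3, "parent": 1, "time": 5400, "stance": "query"},
]
features = thread_features(example_posts)
print(features)
```

In practice such hand-built features would be one input among several; the neural models listed below learn representations from the tree directly rather than relying on summary statistics like these.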
Key papers in this wiki
- Kumar & Carley (2019) — Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations: proposes Tree LSTM and Binarized Constituency Tree (BCTree) LSTM architectures with convolution units for stance and veracity classification; achieves 0.520 mean F1 for stance and 0.379 for veracity on the PHEME dataset, outperforming prior work by 12% and 15% respectively; uses multi-task learning to jointly optimize both tasks.
- Lu & Li (2020) — GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection: Models retweet propagation sequences using GCN and dual co-attention; achieves 87.7% accuracy on Twitter15 and 90.8% on Twitter16; demonstrates that graph-based representations of user interactions (modeled as fully-connected graphs with cosine-similarity edge weights) capture veracity signals; provides interpretable explanations highlighting suspicious user types and informative words.
- Bian et al. (2020) — Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks: first GCN-based rumor detection combining top-down propagation patterns (TD-GCN) and bottom-up dispersion patterns (BU-GCN); source-post feature enhancement amplifies early information; achieves 96.1% accuracy on Weibo, 88.6% on Twitter15, 88.0% on Twitter16, with strong early-detection performance.
- Ma et al. (2017) — Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning: proposes Propagation Tree Kernel (PTK) and context-sensitive variant (cPTK) to detect rumors by measuring structural similarity between propagation trees; extends to finer-grained four-class classification (false/true/unverified/non-rumor); achieves 75% accuracy on Twitter15 with superior early detection (75% accuracy within 24 hours).
- Ma et al. (2018) — Rumor Detection on Twitter with Tree-structured Recursive Neural Networks: proposes bottom-up and top-down recursive neural networks operating on propagation trees; TD-RvNN achieves 72.3% / 73.7% accuracy on Twitter15/16, with superior early detection (8 hours vs. 36 hours to match baseline performance).
- Castillo et al. (2011) — Information Credibility on Twitter: foundational work on Twitter credibility via propagation structure and user reputation; shows retweet tree depth/breadth are strong signals.
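- Vosoughi et al. (2018) — The Spread of True and False News Online: empirical foundation showing that false news diffuses farther, faster, deeper, and more broadly than true news; true stories took about six times as long as false ones to reach 1,500 people.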
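The top-down and bottom-up graph views in the Bian et al. (2020) entry above can be illustrated with a small sketch. It builds a parent-to-child adjacency matrix for a propagation tree, takes its transpose as the bottom-up view, applies the usual GCN normalization with self-loops, and concatenates the source post's features onto every node to mimic source-post enhancement. This is a minimal, weight-free sketch under stated assumptions, not the authors' implementation: the helper names are hypothetical, row-sum degrees are an assumption for the directed case, and the learned weight matrices, nonlinearities, and pooling are omitted.

```python
import math

def adjacency(n, edges):
    """Dense n x n matrix with A[parent][child] = 1 for each (parent, child) edge."""
    A = [[0.0] * n for _ in range(n)]
    for parent, child in edges:
        A[parent][child] = 1.0
    return A

def transpose(A):
    """Reverse every edge direction."""
    return [list(row) for row in zip(*A)]

def normalize(A):
    """Apply D^-1/2 (A + I) D^-1/2, using row sums of A + I as degrees (assumption)."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def propagate(A_norm, X):
    """One weight-free step A_norm @ X: row i mixes node i's features
    with those of every node j where A_norm[i][j] is nonzero."""
    n, f = len(X), len(X[0])
    return [[sum(A_norm[i][k] * X[k][c] for k in range(n)) for c in range(f)]
            for i in range(n)]

# Toy propagation tree: source post 0, replies 1 and 2, reply 3 under 1.
edges = [(0, 1), (0, 2), (1, 3)]
A_td = adjacency(4, edges)   # top-down view
A_bu = transpose(A_td)       # bottom-up view

# One-hot node features keep the arithmetic easy to follow.
X = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

H_td = propagate(normalize(A_td), X)
H_bu = propagate(normalize(A_bu), X)

# Source-post enhancement, sketched as concatenating the root's features
# (node 0) onto each node's hidden features.
H_td_enhanced = [h + H_td[0] for h in H_td]
print(len(H_td_enhanced[0]))
```

A real model would run both branches through trained GCN layers (for example with a graph learning library) and pool the node representations before classification; the point here is only how the two adjacency matrices relate to the same tree.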
Rumor vs. fake news
| Aspect | Rumor | Fake News |
|---|---|---|
| Format | Short claim, often unverified at origin | Full article, often mimics news style |
| Platform | Twitter, Weibo, Reddit (social media) | News websites, articles, sometimes Facebook |
| Veracity | Often genuinely unverified; truth emerges through discussion | Usually deliberately false |
| Propagation | Thread-like with discussion and stance expressions | Cascading shares with less discussion |
| Detection approach | Leverage propagation structure + stance signals | Content analysis + source credibility |
Connections
- Propagation-based detection: rumor detection makes heavy use of structural signals.
- Social-context-based detection: user profiles and engagement patterns are key signals.
- Stance detection: explicit user stances (support, deny, query, comment) in replies are crucial.
- Twitter: primary platform for rumor research.
- Fake news detection: overlapping but distinct task.