Rumor Detection on Twitter with Tree-structured Recursive Neural Networks
Authors: Jing Ma, Wei Gao, Kam-Fai Wong
Venue: ACL 2018 (56th Annual Meeting of the Association for Computational Linguistics), Melbourne, Australia
TL;DR
Proposes two variants of recursive neural networks (RvNN), bottom-up and top-down, that operate on the propagation trees of tweets to detect rumors. Rather than treating tweets as a flat sequence, the models learn representations that follow the reply structure of a thread. On two public Twitter datasets both variants outperform all reported baselines; the top-down model reaches 72.3% and 73.7% accuracy on Twitter15 and Twitter16 respectively, and both variants match the best baseline's accuracy much earlier in a rumor's lifetime.
Contributions
- First deep neural approach that jointly integrates both propagation structure and content semantics using tree-structured recursive neural networks for rumor detection.
- Two RvNN architectures (bottom-up and top-down) that model different aspects of propagation: bottom-up aggregates signals from leaves to root (like citation networks), while top-down propagates signals from root to leaves (matching information diffusion).
- Empirical demonstration that propagation structure provides critical signals for rumor classification: the recursive models clearly outperform prior methods and reach the best baseline's accuracy roughly 4.5× sooner (8 hours / ~90 tweets vs. 36 hours / ~300 posts).
Method
The paper models a rumor as a tree structure where the root is the source tweet and each node is a responsive tweet. Edges represent reply relationships. Each tweet is represented as a TF-IDF vector.
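The tree described above can be held in two small lookup tables. The following is a minimal sketch, assuming reply relationships arrive as `(parent_id, reply_id)` pairs; that input format and the names `build_tree` and `leaf_nodes` are illustrative and not taken from the paper.

```python
from collections import defaultdict


def build_tree(reply_edges, root):
    """Build child and parent lookups from (parent_id, reply_id) pairs."""
    children = defaultdict(list)
    parent = {root: None}  # the source tweet has no parent
    for parent_id, reply_id in reply_edges:
        children[parent_id].append(reply_id)
        parent[reply_id] = parent_id
    return dict(children), parent


def leaf_nodes(children, parent):
    """Tweets that received no replies."""
    return sorted(node for node in parent if node not in children)


# Toy thread: source tweet 0, two direct replies (1, 2), one reply to tweet 1.
reply_edges = [(0, 1), (0, 2), (1, 3)]
children, parent = build_tree(reply_edges, root=0)

print(children)                       # {0: [1, 2], 1: [3]}
print(leaf_nodes(children, parent))   # [2, 3]
```

The `children` map is what a bottom-up pass walks, while the `parent` map and the leaf list are what a top-down pass needs.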
Bottom-up RvNN: Recursively computes node representations from the leaves upward. Each node \(j\) aggregates the hidden states of its children \(S(j)\) with a gated recurrent unit (GRU), where \(\tilde{x}_j\) denotes the node's transformed input vector:
- Sum child hidden states: \(h_S = \sum_{s \in S(j)} h_s\)
- Reset gate: \(r_j = \sigma(W_r \tilde{x}_j + U_r h_S)\)
- Update gate: \(z_j = \sigma(W_z \tilde{x}_j + U_z h_S)\)
- Candidate activation: \(\tilde{h}_j = \tanh(W_h \tilde{x}_j + U_h(h_S \odot r_j))\)
- Final: \(h_j = (1 - z_j) \odot h_S + z_j \odot \tilde{h}_j\)
The root node's hidden state becomes the representation for the entire tree, fed to a softmax layer for classification.
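The bottom-up equations above can be traced in a few lines of code. This is a sketch under stated assumptions, not the authors' implementation: weights are random and untrained, inputs are assumed to be already-transformed vectors (the \(\tilde{x}_j\) above), the output layer `V` has no bias term, and plain Python lists stand in for a tensor library to keep the example dependency-free. All function and variable names are illustrative.

```python
import math
import random


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def gru_cell(x, h_prev, P):
    """One gated update; for the bottom-up model h_prev is h_S."""
    r = sigmoid(add(matvec(P["W_r"], x), matvec(P["U_r"], h_prev)))  # reset gate
    z = sigmoid(add(matvec(P["W_z"], x), matvec(P["U_z"], h_prev)))  # update gate
    gated = [h * g for h, g in zip(h_prev, r)]                       # h_S * r_j, elementwise
    pre = add(matvec(P["W_h"], x), matvec(P["U_h"], gated))
    h_cand = [math.tanh(a) for a in pre]                             # candidate activation
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def bottom_up(node, children, x, P, dim):
    """Hidden state of `node`, computed after all of its children."""
    h_S = [0.0] * dim  # a leaf has no children, so its h_S is the zero vector
    for child in children.get(node, []):
        h_S = add(h_S, bottom_up(child, children, x, P, dim))
    return gru_cell(x[node], h_S, P)


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    total = sum(e)
    return [a / total for a in e]


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


in_dim, dim, n_classes = 5, 4, 4  # toy sizes, chosen arbitrarily
P = {name: rand_matrix(dim, in_dim) for name in ("W_r", "W_z", "W_h")}
P.update({name: rand_matrix(dim, dim) for name in ("U_r", "U_z", "U_h")})
V = rand_matrix(n_classes, dim)   # hypothetical output-layer weights

children = {0: [1, 2], 1: [3]}    # same toy thread shape: 0 -> (1, 2), 1 -> 3
x = {node: [rng.random() for _ in range(in_dim)] for node in range(4)}

h_root = bottom_up(0, children, x, P, dim)  # representation of the whole tree
probs = softmax(matvec(V, h_root))          # distribution over the four classes
```

The recursion mirrors the definition directly; for very deep threads an explicit post-order traversal would avoid Python's recursion limit.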
Top-down RvNN: Propagates information from root downward. Each node computes its representation by combining its own input with its parent's hidden state:
- Reset gate: \(r_j = \sigma(W_r \tilde{x}_j + U_r h_{P(j)})\)
- Update gate: \(z_j = \sigma(W_z \tilde{x}_j + U_z h_{P(j)})\)
- Candidate: \(\tilde{h}_j = \tanh(W_h \tilde{x}_j + U_h(h_{P(j)} \odot r_j))\)
- Final: \(h_j = (1 - z_j) \odot h_{P(j)} + z_j \odot \tilde{h}_j\)
After the downward pass, the information accumulated along each root-to-leaf path resides in the leaf hidden states. Because the number of leaves varies from tree to tree, a max-pooling layer aggregates across all leaves to form a fixed-size input to the final softmax classifier.
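A matching sketch of the top-down pass with max-pooling over leaves follows. As before, this is illustrative rather than the authors' code: weights are random, inputs are assumed to be already-transformed vectors, and the root is given a zero vector as its "parent" state, which is an assumption made here because the root has no parent.

```python
import math
import random


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def gru_cell(x, h_parent, P):
    """One gated update driven by the parent's hidden state h_{P(j)}."""
    r = sigmoid(add(matvec(P["W_r"], x), matvec(P["U_r"], h_parent)))
    z = sigmoid(add(matvec(P["W_z"], x), matvec(P["U_z"], h_parent)))
    gated = [h * g for h, g in zip(h_parent, r)]
    pre = add(matvec(P["W_h"], x), matvec(P["U_h"], gated))
    h_cand = [math.tanh(a) for a in pre]
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_parent, h_cand)]


def top_down_leaf_states(root, children, x, P, dim):
    """Propagate states from the root downward and collect the leaf states."""
    leaf_states = []
    stack = [(root, [0.0] * dim)]  # assumption: zero parent state for the root
    while stack:
        node, h_parent = stack.pop()
        h = gru_cell(x[node], h_parent, P)
        kids = children.get(node, [])
        if not kids:
            leaf_states.append(h)
        for child in kids:
            stack.append((child, h))
    return leaf_states


def max_pool(vectors):
    """Elementwise maximum over a variable number of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    total = sum(e)
    return [a / total for a in e]


rng = random.Random(1)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


in_dim, dim, n_classes = 5, 4, 4  # toy sizes, chosen arbitrarily
P = {name: rand_matrix(dim, in_dim) for name in ("W_r", "W_z", "W_h")}
P.update({name: rand_matrix(dim, dim) for name in ("U_r", "U_z", "U_h")})
V = rand_matrix(n_classes, dim)   # hypothetical output-layer weights

children = {0: [1, 2], 1: [3]}    # toy thread with leaves 2 and 3
x = {node: [rng.random() for _ in range(in_dim)] for node in range(4)}

leaf_states = top_down_leaf_states(0, children, x, P, dim)
pooled = max_pool(leaf_states)          # fixed-size vector regardless of leaf count
probs = softmax(matvec(V, pooled))
```

Because each node depends only on its parent, an explicit stack is enough and no recursion is needed; the pooled vector has length `dim` no matter how many leaves the tree has.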
Intuition: When a post denies a false rumor, supportive replies to that denial reinforce the denial's stance. Conversely, denials of true rumors trigger questioning. The recursive structure naturally captures these local patterns and aggregates them across the tree.
Results
Rumor Classification (4-way: non-rumor, false rumor, true rumor, unverified rumor):
| Model | Twitter15 Acc | Twitter16 Acc |
|---|---|---|
| DTC (Decision-Tree) | 45.4% | 46.5% |
| RFC (Random Forest) | 56.5% | 58.5% |
| SVM-TS (time-series features) | 54.4% | 57.4% |
| SVM-TK (tree kernel) | 66.7% | 66.2% |
| GRU-RNN (sequential) | 64.1% | 63.3% |
| BU-RvNN | 70.8% | 71.8% |
| TD-RvNN | 72.3% | 73.7% |
Top-down RvNN outperforms the best baseline (SVM-TK) by 5.6 percentage points on Twitter15 and 7.5 on Twitter16.
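The margins quoted above follow directly from the accuracy table. A small check, using only the figures in the table (the variable names are illustrative):

```python
# Accuracies (%) copied from the table: (Twitter15, Twitter16).
accuracy = {
    "DTC": (45.4, 46.5),
    "RFC": (56.5, 58.5),
    "SVM-TS": (54.4, 57.4),
    "SVM-TK": (66.7, 66.2),
    "GRU-RNN": (64.1, 63.3),
    "BU-RvNN": (70.8, 71.8),
    "TD-RvNN": (72.3, 73.7),
}
baselines = ["DTC", "RFC", "SVM-TS", "SVM-TK", "GRU-RNN"]
datasets = ("Twitter15", "Twitter16")

best_baseline = {}
margins = {}
for i, dataset in enumerate(datasets):
    best = max(baselines, key=lambda m: accuracy[m][i])
    best_baseline[dataset] = best
    for model in ("BU-RvNN", "TD-RvNN"):
        # Difference in percentage points, rounded to the table's precision.
        margins[(model, dataset)] = round(accuracy[model][i] - accuracy[best][i], 1)

for key, value in margins.items():
    print(key, value)
```

SVM-TK is the strongest baseline on both datasets. TD-RvNN leads it by 5.6 and 7.5 points, and BU-RvNN by 4.1 and 5.6 points.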
Early Detection Performance:
Both RvNN models reach the best baseline's accuracy (~66% for SVM-TK) after just 8 hours of elapsed time or ~90 tweets, whereas SVM-TK needs ~36 hours or ~300 posts. That is roughly 4.5× less elapsed time (36 / 8) and about 3.3× fewer posts (300 / 90).
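This summary does not spell out how the early-detection curves are produced. One plausible protocol, assumed here, is to truncate each thread at a deadline measured from the source tweet and classify only what is visible at that point. The sketch below shows such a truncation; `posted_at`, its made-up timestamps, and the function name are hypothetical.

```python
def truncate_at_deadline(parent, posted_at, deadline_hours):
    """Child map of the part of the tree visible `deadline_hours` after the source tweet."""
    root = next(node for node, p in parent.items() if p is None)
    start = posted_at[root]

    def visible(node):
        # A tweet is usable only if it and all of its ancestors were posted in time,
        # so the truncated structure is still a tree rooted at the source tweet.
        while node is not None:
            if posted_at[node] - start > deadline_hours:
                return False
            node = parent[node]
        return True

    children = {}
    for node in sorted(parent, key=posted_at.get):  # chronological order
        if parent[node] is not None and visible(node):
            children.setdefault(parent[node], []).append(node)
    return children


parent = {0: None, 1: 0, 2: 0, 3: 1}
posted_at = {0: 0.0, 1: 2.0, 2: 10.0, 3: 5.0}  # hours since an arbitrary origin

early = truncate_at_deadline(parent, posted_at, deadline_hours=8)
full = truncate_at_deadline(parent, posted_at, deadline_hours=24)
print(early)  # {0: [1], 1: [3]}
print(full)   # {0: [1, 2], 1: [3]}
```

Truncating by post count instead of elapsed time would follow the same pattern, keeping the first N tweets in chronological order.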
Per-class F₁ scores (Twitter16):
| Method | Non-rumor | False | True | Unverified |
|---|---|---|---|---|
| TD-RvNN | 0.662 | 0.743 | 0.835 | 0.708 |
The model performs weakest on non-rumors (0.662), likely because their responses are more diverse and carry less discriminative signal; it performs best on true rumors (0.835), ahead of false (0.743) and unverified (0.708) rumors.
Connections
- Related to Information Credibility on Twitter via shared use of propagation structure for credibility assessment, though Ma et al. use learned representations while Castillo et al. use engineered features.
- Extends propagation-based detection methods beyond hand-crafted kernels (e.g., tree kernels) to learned neural representations.
- Compared against sequential GRU models (Ma et al. 2016, 2018); shows that tree structure captures signals that sequential ordering misses.
- Foundational work in tree-structured neural network architectures applied to social media; related to recursive neural networks used in NLP for parsing and sentiment analysis.
Notes
Strengths:
- Natural architectural fit: trees are how social media propagation naturally manifests. Unlike sequential models that impose arbitrary ordering, RvNN respects the actual thread structure.
- Strong empirical results with substantial improvements over prior work and a demonstrable early-detection advantage, which is critical for real-time intervention.
- Two complementary architectures (bottom-up vs. top-down) with clear intuition and measurable trade-offs. Top-down's superiority aligns with the hypothesis that pooling across leaves captures more information than relying on a single root representation.
- Clear paper writing; method section is technical but well-explained.

Weaknesses:
- Relatively weak on non-rumor classification, likely due to sparse discriminative signals in varied responses. The method is better suited to separating true vs. false claims than identifying rumors overall.
- Evaluation limited to Twitter; generalization to other platforms (Reddit, Facebook, WhatsApp) unknown, and those platforms may have different propagation structures.
- No user profile features, which prior work (e.g., Castillo et al.) showed helpful. Ma et al. acknowledge this as future work.
- Datasets relatively small (1,381 and 1,181 trees) by modern standards. Unclear if gains hold at larger scale or if overfitting is a concern.

Fit for the wiki: Strong foundational paper in propagation-based detection. Bridges content-based and structure-based approaches; demonstrates that graph structure provides critical signals independent of text. Published in a top venue (ACL). Highly cited in the rumor detection literature.