Weibo21
Full name: Weibo21: Multi-Domain Fake News Detection Dataset
Authors: Qiong Nan, Juan Cao, Yongchun Zhu, Yanyan Wang, Jintao Li
Paper: Nan et al. (2021) — MDFEND: Multi-domain Fake News Detection
Access: https://github.com/kennqiang/MDFEND-Weibo21 (dataset and code)
Description
Weibo21 is the first multi-domain fake news detection dataset collected from a single platform (Sina Weibo) with explicit domain annotations. It addresses a critical gap in fake news research: while most prior datasets focus on single-domain detection, real-world news streams cover diverse topics with distinct linguistic patterns and propagation behaviors. Domain shift—differences in word usage and information cascades across topics—is a core challenge for deploying detection systems in practice. Weibo21 enables systematic evaluation of multi-domain detection methods and cross-domain transfer learning.
Statistics
Overall dataset:
- Total items: 9,128 (4,488 fake, 4,640 real)
- Balance: approximately 1:1 real to fake
- Collection period: December 2014 to March 2021
- Domains: 9
Per-domain breakdown:
| Domain | Real | Fake | Total |
|---|---|---|---|
| Science | 143 | 93 | 236 |
| Military | 121 | 222 | 343 |
| Education | 243 | 248 | 491 |
| Disasters | 185 | 591 | 776 |
| Politics | 306 | 546 | 852 |
| Health | 485 | 515 | 1,000 |
| Finance | 959 | 362 | 1,321 |
| Entertainment | 1,000 | 440 | 1,440 |
| Society | 1,198 | 1,471 | 2,669 |
| All | 4,640 | 4,488 | 9,128 |
The classes are roughly balanced overall, but the ratio varies by domain: Disasters (76% fake), Military (65% fake), and Politics (64% fake) are fake-heavy, while Finance (73% real) and Entertainment (69% real) are real-heavy.
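The per-domain imbalance can be recomputed from the table. The snippet below is a minimal sketch that hard-codes the counts listed above; the names `DOMAIN_COUNTS`, `fake_share`, and `summarize` are illustrative and are not part of the dataset release.

```python
# Per-domain counts copied from the table above, as (real, fake).
DOMAIN_COUNTS = {
    "Science": (143, 93),
    "Military": (121, 222),
    "Education": (243, 248),
    "Disasters": (185, 591),
    "Politics": (306, 546),
    "Health": (485, 515),
    "Finance": (959, 362),
    "Entertainment": (1000, 440),
    "Society": (1198, 1471),
}


def fake_share(real: int, fake: int) -> float:
    """Fraction of items in a domain that are labeled fake."""
    return fake / (real + fake)


def summarize(counts: dict) -> list:
    """Return (domain, total, fake_share) rows, most fake-heavy first."""
    rows = [
        (domain, real + fake, fake_share(real, fake))
        for domain, (real, fake) in counts.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for domain, total, share in summarize(DOMAIN_COUNTS):
        print(f"{domain:<14}{total:>6}{share:>8.1%}")
```

Sorting by fake share puts Disasters first and Finance last, which is a quick way to see which domains need class weighting or stratified sampling.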
Schema
Each Weibo microblog record contains:
| Field | Description |
|---|---|
| id | Weibo microblog identifier |
| domain | One of: Science, Military, Education, Disasters, Politics, Health, Finance, Entertainment, Society |
| label | "real" or "fake" |
| text | Microblog content (Chinese) |
| images | Associated images (if any) |
| timestamp | Post timestamp |
| comments | User comments (list of objects) |
The dataset includes rich social context data (comments, timestamps) enabling both content-based and propagation-based detection research.
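For illustration, the schema can be mirrored in a small typed record. This is a sketch only: the field names follow the table above, but the Python types, the shape of each comment object, and the `parse_record` helper are assumptions, and the released files may encode labels or domains differently.

```python
from dataclasses import dataclass, field
from typing import Any

DOMAINS = (
    "Science", "Military", "Education", "Disasters", "Politics",
    "Health", "Finance", "Entertainment", "Society",
)
LABELS = ("real", "fake")


@dataclass
class WeiboRecord:
    """One microblog, using the field names from the schema table."""
    id: str
    domain: str
    label: str
    text: str
    timestamp: str  # assumed to be kept as the raw string from the file
    images: list[str] = field(default_factory=list)
    comments: list[dict[str, Any]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the documented vocabularies.
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain!r}")
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def parse_record(raw: dict[str, Any]) -> WeiboRecord:
    """Build a record from a dict; missing images/comments become empty lists."""
    return WeiboRecord(
        id=str(raw["id"]),
        domain=raw["domain"],
        label=raw["label"],
        text=raw["text"],
        timestamp=str(raw["timestamp"]),
        images=list(raw.get("images") or []),
        comments=list(raw.get("comments") or []),
    )
```

Validating `domain` and `label` at load time catches encoding mismatches early, before they silently skew per-domain metrics.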
Data collection
Fake news:
- Collected from Sina Weibo's official Community Management Center (Weibo辟谣), which verifies and labels misinformation reported by users.
- Official expert evaluation text is provided for each fake item.
Real news:
- Verified news from NewsVerify, a Weibo-based platform focused on discovering and validating suspicious news claims.
- Real news articles are paired with fake news temporally and semantically to create a realistic mixed timeline.
Deduplication:
- One-pass clustering is applied to remove duplicates from the raw collection.
- Final deduplicated set: 4,488 fake and 4,640 real.
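The deduplication step can be pictured with a generic single-pass clustering routine: each text is compared with the representatives of the clusters seen so far and is either absorbed as a duplicate or kept as a new representative. The sketch below assumes character-trigram Jaccard similarity and a threshold of 0.8; both are illustrative choices, not settings taken from the dataset authors.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Character n-grams after removing whitespace (suits unsegmented Chinese)."""
    text = "".join(text.split())
    if len(text) <= n:
        return {text}
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def single_pass_dedup(texts: list[str], threshold: float = 0.8) -> list[int]:
    """Return indices of texts kept as cluster representatives.

    Each text is visited once. If it is at least `threshold` similar to an
    existing representative it is dropped as a duplicate; otherwise it
    starts a new cluster.
    """
    representatives: list[set[str]] = []
    kept: list[int] = []
    for index, text in enumerate(texts):
        grams = char_ngrams(text)
        if any(jaccard(grams, rep) >= threshold for rep in representatives):
            continue
        representatives.append(grams)
        kept.append(index)
    return kept
```

The result depends on input order because only the first member of each cluster is retained, which is the usual trade-off for making a single pass over the data.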
Domain annotation
Domain labels were assigned via crowdsourcing by 10 expert annotators:
1. Each microblog is labeled independently by all 10 experts.
2. A label is accepted when more than 8 experts agree on it.
3. Remaining disagreements are resolved through expert discussion.
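The agreement rule can be written as a small function. This is a sketch of the rule as stated here (more than 8 of 10 votes, so at least 9); the function name `reconcile` and the use of `None` to mark items that go to discussion are illustrative.

```python
from collections import Counter
from typing import Optional


def reconcile(votes: list[str], min_agree: int = 9) -> Optional[str]:
    """Return the agreed domain label, or None if the item needs discussion.

    With 10 annotators, "more than 8 agreeing" means at least 9 identical
    votes, hence the default of 9.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None
```

For example, nine votes for Health and one for Science resolve to Health, while an 8-to-2 split returns `None` and would be passed to the discussion step.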
Nine domains chosen based on fact-checking websites (Zhuoyaoji, Liuyanbaike, Jiaozhen, Ruijianshiyao) and prior misinformation research reports (Vosoughi et al., Tencent Rumor Governance Report, China Joint Internet Rumor-Busting Platform).
Key domain characteristics
Domain-specific linguistic and propagation differences are quantified in the paper:
Word usage differences (word clouds):
- Military: "navy" (海军), "army" (陆军)
- Health: "patients" (患者), "hospital" (医院)
- Education: "students" (学生), "university" (大学), "teacher" (教师)
- Finance: "stock market" (股市), "investment" (投资)
Propagation pattern differences:
- Disaster news spreads faster and wider
- Political news shows high engagement
- Entertainment news exhibits different bot participation rates
These domain differences motivate multi-domain detection: a model fitted to the vocabulary and propagation patterns of one domain (say, Politics) may transfer poorly to another (say, Entertainment), and vice versa.
Benchmark results
Evaluation results from the MDFEND paper (F₁-scores) comparing single-domain (BERT_single), mixed-domain (TextCNN_all), and multi-domain (EANN, MMOE, EDDFN, MDFEND) methods:
| Method | Science | Military | Education | Disasters | Politics | Health | Finance | Entertainment | Society | All |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT_single | 0.8192 | 0.7795 | 0.8136 | 0.7885 | 0.8188 | 0.8909 | 0.8464 | 0.8638 | 0.8242 | 0.8272 |
| TextCNN_all | 0.7254 | 0.8839 | 0.8362 | 0.8222 | 0.8561 | 0.8768 | 0.8638 | 0.8456 | 0.8540 | 0.8686 |
| EANN | 0.8225 | 0.9274 | 0.8624 | 0.8666 | 0.8705 | 0.9150 | 0.8710 | 0.8957 | 0.8877 | 0.8975 |
| MMOE | 0.8755 | 0.9112 | 0.8706 | 0.8770 | 0.8620 | 0.9364 | 0.8567 | 0.8886 | 0.8750 | 0.8947 |
| EDDFN | 0.8186 | 0.9137 | 0.8676 | 0.8786 | 0.8478 | 0.9379 | 0.8636 | 0.8832 | 0.8689 | 0.8919 |
| MDFEND | 0.8301 | 0.9389 | 0.8917 | 0.9003 | 0.8865 | 0.9400 | 0.8951 | 0.9066 | 0.8980 | 0.9137 |
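Key findings:
- The single-domain baseline has the lowest overall score (BERT_single, macro average 0.8272).
- Multi-domain models outperform mixed-domain ones: every multi-domain row exceeds 0.89 overall, against 0.8795 cited for mixed-domain training (the one mixed-domain row shown here, TextCNN_all, scores 0.8686).
- MDFEND achieves the best overall F₁ of 0.9137, 1.62 points above the second-best method in the table (EANN at 0.8975) and 1.90 points above MMOE (0.8947).
- MDFEND leads in eight of the nine domains; the exception is Science, where MMOE scores 0.8755 against 0.8301. Its highest scores are in Health (0.9400) and Military (0.9389), and its largest margins over the best baseline are in Finance (+0.0241), Disasters (+0.0217), and Education (+0.0211).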
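The comparison against the strongest baseline in each column can be recomputed directly from the table. The sketch below hard-codes the F₁ values shown above; `COLUMNS`, `F1`, and `margin_over_best_baseline` are illustrative names.

```python
COLUMNS = [
    "Science", "Military", "Education", "Disasters", "Politics",
    "Health", "Finance", "Entertainment", "Society", "All",
]

# F1 scores copied from the benchmark table, one list per method.
F1 = {
    "BERT_single": [0.8192, 0.7795, 0.8136, 0.7885, 0.8188, 0.8909, 0.8464, 0.8638, 0.8242, 0.8272],
    "TextCNN_all": [0.7254, 0.8839, 0.8362, 0.8222, 0.8561, 0.8768, 0.8638, 0.8456, 0.8540, 0.8686],
    "EANN":        [0.8225, 0.9274, 0.8624, 0.8666, 0.8705, 0.9150, 0.8710, 0.8957, 0.8877, 0.8975],
    "MMOE":        [0.8755, 0.9112, 0.8706, 0.8770, 0.8620, 0.9364, 0.8567, 0.8886, 0.8750, 0.8947],
    "EDDFN":       [0.8186, 0.9137, 0.8676, 0.8786, 0.8478, 0.9379, 0.8636, 0.8832, 0.8689, 0.8919],
    "MDFEND":      [0.8301, 0.9389, 0.8917, 0.9003, 0.8865, 0.9400, 0.8951, 0.9066, 0.8980, 0.9137],
}


def margin_over_best_baseline(target: str = "MDFEND") -> dict:
    """Map each column to (best other method, target score minus that score)."""
    result = {}
    for i, column in enumerate(COLUMNS):
        best_name, best_score = max(
            ((name, scores[i]) for name, scores in F1.items() if name != target),
            key=lambda pair: pair[1],
        )
        result[column] = (best_name, round(F1[target][i] - best_score, 4))
    return result


if __name__ == "__main__":
    for column, (baseline, gap) in margin_over_best_baseline().items():
        print(f"{column:<14}{baseline:<12}{gap:+.4f}")
```

On the "All" column the strongest baseline is EANN with a gap of +0.0162, and Science is the only column with a negative gap.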
Intended use
- Multi-domain fake news detection: Benchmark for evaluating methods that handle domain shift.
- Transfer learning: Evaluate models' ability to generalize across different news domains.
- Cross-domain generalization: Study how models trained on one domain transfer to others.
- Domain adaptation: Research fine-tuning strategies for new domains with limited labeled data.
- Chinese social media analysis: Rich dataset for studying misinformation on Weibo specifically.
- Propagation analysis: Comments and timestamps enable study of information spread patterns by domain.
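For the cross-domain uses listed above, one common way to set up an experiment is a leave-one-domain-out split, where each domain in turn is held out for testing. The sketch below assumes records are dicts with a `domain` key as in the schema; it is one possible protocol, not one prescribed by the dataset.

```python
from typing import Iterator


def leave_one_domain_out(
    records: list[dict],
) -> Iterator[tuple[str, list[dict], list[dict]]]:
    """Yield (held_out_domain, train_records, test_records) for each domain."""
    domains = sorted({record["domain"] for record in records})
    for held_out in domains:
        train = [r for r in records if r["domain"] != held_out]
        test = [r for r in records if r["domain"] == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    toy = [
        {"id": "1", "domain": "Health", "label": "fake"},
        {"id": "2", "domain": "Finance", "label": "real"},
        {"id": "3", "domain": "Health", "label": "real"},
    ]
    for domain, train, test in leave_one_domain_out(toy):
        print(domain, len(train), len(test))
```

Because domain sizes range from 236 (Science) to 2,669 (Society), per-domain results from such a split are best reported separately rather than pooled.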
Limitations
- Single platform: Dataset limited to Sina Weibo; generalization to Twitter, Facebook, or other platforms unknown.
- Language: Chinese-only; monolingual evaluation cannot assess cross-lingual transfer.
- Domain categories: Nine domains are coarse-grained; fine-grained categorization (e.g., vaccine misinformation vs. general health) might reveal additional domain shift challenges.
- Data size: 9,128 total items is relatively small compared to English benchmarks (FakeNewsNet: 35K+ articles, NELA-GT-2022: 1.78M articles); multimodal approaches are underdeveloped.
- Temporal bias: Collection period (Dec 2014–Mar 2021) spans different eras of platform policies and user behavior; no controlled temporal split provided.
- Expert annotation: Domain labels were assigned by 10 experts using a more-than-8 agreement threshold, with the remaining items settled by discussion; inter-annotator agreement and disagreement on boundary cases are not quantified.
Connections
- FakeNewsNet is the dominant multi-dimensional benchmark for English; Weibo21's multi-domain structure complements FakeNewsNet's multi-source richness.
- CHECKED is another Chinese Weibo dataset but focuses exclusively on COVID-19; Weibo21 is broader in scope.
- Nan et al. (2021) introduces Weibo21 and proposes MDFEND.
- Silva et al. (2021) addresses similar cross-domain challenges for multimodal detection on English datasets.
- Wang et al. (2018) — EANN pioneered event-invariant learning; MDFEND extends the idea to explicit domain-aware gating.