MDFEND: Multi-domain Fake News Detection

Authors: Qiong Nan, Juan Cao, Yongchun Zhu, Yanyan Wang, Jintao Li
Venue: CIKM '21, November 1–5, 2021, Virtual Event, QLD, Australia — DOI

TL;DR

Most fake news detection methods focus on single-domain data, which leads to poor performance across different domains due to domain shift (varying word usage and propagation patterns). This paper introduces Weibo21, the first multi-domain fake news dataset from a single platform with 9 domains, and proposes MDFEND, a model using mixture-of-experts with a domain gate to adaptively aggregate domain-specific representations and achieve state-of-the-art multi-domain detection performance.

Contributions

- Weibo21 Dataset: First multi-domain fake news detection dataset collected from a single platform (Sina Weibo) with domain labels. Contains 4,488 fake news and 4,640 real news from 9 domains: Science, Military, Education, Disasters, Politics, Health, Finance, Entertainment, and Society. Each item includes news content, timestamp, images, and comments.
- MDFEND Model: Multi-domain fake news detection framework addressing domain shift via:
    - Multiple expert networks (mixture-of-experts) extracting diverse representations
    - Domain gate mechanism using domain embedding and sentence embedding to adaptively weight expert outputs
    - Binary classification with shared feature space across domains
- Systematic Evaluation: Comprehensive comparison of single-domain, mixed-domain, and multi-domain baselines on Weibo21, demonstrating the effectiveness of domain-aware learning.

Method

Representation Extraction:

- Input: News content tokenized with BERT, producing word embeddings W
- Mask-Attention network extracts sentence-level embedding e_s
- Learnable domain embedding e_d for each domain
- T expert networks (TextCNN-based) independently extract representations: r_i = Ψ_i(W; θ_i)
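
The summary does not spell out the internals of the Mask-Attention network. One plausible reading, sketched here as an assumption rather than the authors' implementation, is attention pooling in which padded positions are masked out before the softmax. The function name `masked_attention_pool` and the scoring vector `score_weights` are hypothetical.

```python
import math

def masked_attention_pool(word_embeddings, mask, score_weights):
    """Pool token vectors W into a sentence vector e_s, ignoring padding.

    word_embeddings: list of token vectors (lists of floats), i.e. W
    mask: list of 0/1 flags, 1 for a real token, 0 for padding
    score_weights: hypothetical learned vector used to score each token
    Assumes at least one position is unmasked.
    """
    # Raw attention score per token: dot product with the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, tok))
              for tok in word_embeddings]
    # Padded positions get -inf so they receive zero weight after softmax.
    scores = [s if m else -math.inf for s, m in zip(scores, mask)]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Sentence embedding: attention-weighted sum of token vectors.
    dim = len(word_embeddings[0])
    e_s = [sum(a * tok[d] for a, tok in zip(attn, word_embeddings))
           for d in range(dim)]
    return e_s, attn

# Toy input: two real tokens and one padded position.
W = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
e_s, attn = masked_attention_pool(W, mask, score_weights=[0.0, 0.0])
```

With a zero scoring vector the two real tokens share the weight equally and the padded token contributes nothing, which is the behaviour the mask is meant to guarantee.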

Domain Gate:

- Takes concatenation of domain embedding e_d and sentence embedding e_s
- Feed-forward network G with softmax outputs weight vector a ∈ ℝ^T
- Final feature vector: v = Σ(a_i × r_i) for i=1 to T
- Weighting adaptively balances expert contributions based on domain and content
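
The gating step can be illustrated with a small sketch. It follows the formulas above (softmax over G applied to the concatenation of e_d and e_s, then a weighted sum of expert outputs), but reduces G to a single linear layer and uses random placeholder values; the function names, dimensions, and weights are assumptions for illustration, not the paper's configuration.

```python
import math
import random

def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def domain_gate(e_d, e_s, gate_weights, gate_bias):
    """Return a = softmax(G([e_d; e_s])), with G simplified to one linear layer."""
    gate_input = e_d + e_s  # list concatenation of the two embeddings
    logits = [sum(w * x for w, x in zip(row, gate_input)) + b
              for row, b in zip(gate_weights, gate_bias)]
    return softmax(logits)

def aggregate(expert_outputs, a):
    """Return v = sum_i a_i * r_i over the T expert representations."""
    dim = len(expert_outputs[0])
    return [sum(a_i * r[d] for a_i, r in zip(a, expert_outputs))
            for d in range(dim)]

random.seed(0)
T, d_dom, d_sent, d_rep = 3, 2, 2, 4  # illustrative sizes only
e_d = [0.3, -0.1]   # domain embedding (placeholder values)
e_s = [0.5, 0.2]    # sentence embedding (placeholder values)
gate_weights = [[random.uniform(-1, 1) for _ in range(d_dom + d_sent)]
                for _ in range(T)]
gate_bias = [0.0] * T
# r_1..r_T would come from the TextCNN experts; random stand-ins here.
expert_outputs = [[random.uniform(-1, 1) for _ in range(d_rep)]
                  for _ in range(T)]

a = domain_gate(e_d, e_s, gate_weights, gate_bias)
v = aggregate(expert_outputs, a)
```

Because the gate sees both e_d and e_s, two items from the same domain can still receive different expert weights, which is what distinguishes this from a fixed per-domain mixture.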

Prediction:

- Multi-layer perceptron with softmax for binary fake/real classification
- Binary Cross-Entropy Loss with balanced class weights
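
The summary does not say how the class weights are derived. A minimal sketch, assuming the common inverse-frequency heuristic (total / (2 × class count)) and using the full-dataset counts from the Contributions section for illustration; in practice the weights would be computed on the training split. Both function names are hypothetical.

```python
import math

def balanced_class_weights(n_fake, n_real):
    """Inverse-frequency weights; an assumed reading of 'balanced'."""
    total = n_fake + n_real
    return {"fake": total / (2 * n_fake), "real": total / (2 * n_real)}

def weighted_bce(p_fake, is_fake, weights, eps=1e-12):
    """Class-weighted binary cross-entropy for a single prediction.

    p_fake: predicted probability that the item is fake
    is_fake: ground-truth label (True for fake, False for real)
    """
    p = min(max(p_fake, eps), 1 - eps)  # clamp to avoid log(0)
    if is_fake:
        return -weights["fake"] * math.log(p)
    return -weights["real"] * math.log(1 - p)

# Weibo21 counts: 4,488 fake and 4,640 real items.
weights = balanced_class_weights(4488, 4640)
loss_confident_right = weighted_bce(0.9, True, weights)
loss_confident_wrong = weighted_bce(0.9, False, weights)
```

Since Weibo21 is close to balanced, both weights stay near 1 under this heuristic, so the weighting changes the loss only slightly.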

Results

F1-scores on Weibo21:

| Model | Science | Military | Education | Disasters | Politics | Health | Finance | Entertainment | Society | All |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT_single | 0.8192 | 0.7795 | 0.8136 | 0.7885 | 0.8188 | 0.8909 | 0.8464 | 0.8638 | 0.8242 | 0.8272 |
| BERT_all | 0.7777 | 0.9072 | 0.8331 | 0.8512 | 0.8366 | 0.9090 | 0.8735 | 0.8769 | 0.8577 | 0.8795 |
| EANN | 0.8225 | 0.9274 | 0.8624 | 0.8666 | 0.8705 | 0.9150 | 0.8710 | 0.8957 | 0.8877 | 0.8975 |
| MMOE | 0.8755 | 0.9112 | 0.8706 | 0.8770 | 0.8620 | 0.9364 | 0.8567 | 0.8886 | 0.8750 | 0.8947 |
| EDDFN | 0.8186 | 0.9137 | 0.8676 | 0.8786 | 0.8478 | 0.9379 | 0.8636 | 0.8832 | 0.8689 | 0.8919 |
| MDFEND | 0.8301 | 0.9389 | 0.8917 | 0.9003 | 0.8865 | 0.9400 | 0.8951 | 0.9066 | 0.8980 | 0.9137 |

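Key findings:

- Multi-domain models (EANN, MMOE, EDDFN, MDFEND) outperform the single-domain and mixed-domain BERT baselines on overall F1: 0.8919 to 0.9137, versus 0.8272 for BERT_single and 0.8795 for BERT_all.
- MDFEND improves overall F1 by 1.62 points over the best baseline, EANN (0.9137 vs 0.8975), and by 1.90 points over MMOE (0.8947).
- Domain gate effectively leverages both domain and content signals.
- MDFEND scores highest in 8 of the 9 domains; the exception is Science, where MMOE leads (0.8755 vs 0.8301). Its margin over the best baseline is largest in Disasters (+2.17 points), Finance (+2.16), and Education (+2.11), and smallest in Health (+0.21).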
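
The comparison against the baselines can be recomputed from the table. The sketch below uses only the scores listed there; the helper name `margins_over_best_baseline` is made up for this note.

```python
# F1-scores copied from the Weibo21 results table.
DOMAINS = ["Science", "Military", "Education", "Disasters", "Politics",
           "Health", "Finance", "Entertainment", "Society", "All"]
F1 = {
    "BERT_single": [0.8192, 0.7795, 0.8136, 0.7885, 0.8188,
                    0.8909, 0.8464, 0.8638, 0.8242, 0.8272],
    "BERT_all":    [0.7777, 0.9072, 0.8331, 0.8512, 0.8366,
                    0.9090, 0.8735, 0.8769, 0.8577, 0.8795],
    "EANN":        [0.8225, 0.9274, 0.8624, 0.8666, 0.8705,
                    0.9150, 0.8710, 0.8957, 0.8877, 0.8975],
    "MMOE":        [0.8755, 0.9112, 0.8706, 0.8770, 0.8620,
                    0.9364, 0.8567, 0.8886, 0.8750, 0.8947],
    "EDDFN":       [0.8186, 0.9137, 0.8676, 0.8786, 0.8478,
                    0.9379, 0.8636, 0.8832, 0.8689, 0.8919],
    "MDFEND":      [0.8301, 0.9389, 0.8917, 0.9003, 0.8865,
                    0.9400, 0.8951, 0.9066, 0.8980, 0.9137],
}

def margins_over_best_baseline(f1, domains, target="MDFEND"):
    """For each column, find the strongest non-target model and the gap to it."""
    result = {}
    for col, domain in enumerate(domains):
        best_name, best_score = max(
            ((name, scores[col]) for name, scores in f1.items()
             if name != target),
            key=lambda pair: pair[1],
        )
        result[domain] = (best_name, round(f1[target][col] - best_score, 4))
    return result

margins = margins_over_best_baseline(F1, DOMAINS)
for domain, (baseline, margin) in margins.items():
    print(f"{domain:<14}{baseline:<12}{margin:+.4f}")
```

A negative margin marks a column where a baseline beats MDFEND.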

Connections

Related to EANN, one of the baselines here, which uses adversarial learning to extract event-invariant features; MDFEND instead handles domain differences with a domain-gated mixture of experts and has no adversarial component in the method described above.
Extends cross-domain learning literature by applying mixture-of-experts gating mechanism to fake news detection.
Makes an architectural choice similar to transfer learning methods that learn shared and domain-specific representations.
Complements Silva et al. (2021), which addresses multimodal cross-domain detection; MDFEND is text-only but systematically evaluates 9 domains on a single platform.
Addresses domain shift problem identified in content-based detection literature where linguistic and propagation patterns vary across domains.

Notes

Strengths:

- First comprehensive multi-domain dataset from a single platform; enables controlled evaluation across domains without platform confounds.
- Well-motivated problem: domain shift is a practical challenge for real-world deployment of detection systems.
- Elegant solution using mixture-of-experts with domain gating; both theoretically motivated and empirically effective.
- Thorough experimental design: evaluates single-domain, mixed-domain, and multi-domain baselines; ablation shows domain gate is essential.
- Clear presentation of domain differences (word clouds, propagation patterns) demonstrating domain shift is real.

Limitations:

- Evaluation limited to Chinese Weibo data; generalization to other platforms (Twitter, Facebook) or languages unknown.
- Weibo21 is relatively small (9,128 total items); widely used English datasets such as FakeNewsNet and FEVER are larger.
- Domain categories are coarse-grained; many real-world news domains (e.g., medical misinformation vs. general health news) might benefit from finer-grained distinctions.
- Domain embeddings are learned jointly; unclear if pre-trained domain representations would improve transfer to new domains.
- The domain gate takes the domain embedding as input, so the domain label must be known at inference time; fully unsupervised domain discovery remains open.
- Text-only approach ignores the images and social context (e.g., comments) available in Weibo21; multimodal extension could be valuable.

Follow-ups:

- Apply MDFEND to other languages and platforms to establish generalizability.
- Extend to truly zero-shot domain adaptation where new domains are never seen during training.
- Combine with propagation-based detection methods using the rich social context data in Weibo21.
- Investigate learned interactions between modalities (text + images) in Chinese social media.
- Explore whether domain gate can dynamically discover domains without manual annotation.