Skip to content
r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection

r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection

Authors: Kai Nakamura, Sharon Levy, William Yang Wang
Venue: arXiv — 1911.03854

TL;DR

Fakeddit is a large-scale multimodal fake news detection dataset with over 1 million Reddit samples labeled with 2-way, 3-way, and 6-way classification schemes. Multimodal models combining text (BERT) and image (ResNet50) features achieve 86.0% accuracy on 6-way classification, demonstrating the importance of jointly leveraging text and image information for fine-grained fake news categorization.

Contributions

  • Large-scale multimodal dataset: over 1 million Reddit submissions sourced from 22 subreddits, with 64% multimodal (text+image) coverage, spanning 2008–2019.
  • Fine-grained labeling: samples labeled for 2-way (fake/true), 3-way (true/misleading true/false), and 6-way (true, satire/parody, misleading content, imposter content, false connection, manipulated content) classification via distant supervision.
  • Comprehensive baseline experiments: evaluation of text-only (BERT, InferSent), image-only (VGG16, EfficientNet, ResNet50), and multimodal fusion methods; "maximum" feature fusion yields best results.
  • Error analysis: identifies satire and imposter content as the hardest categories; model frequently misclassifies as true due to class imbalance in 6-way setting.

Dataset Overview

Scale and composition: - Total samples: 1,063,106 (628,501 fake, 527,049 true) - Multimodal samples: 682,996 (64%) - Subreddits: 22 across true, satirical, misleading, manipulated, imposter, and false-connection categories - Unique users: 358,504 - Unique domains: 24,203 - Timespan: March 19, 2008 – October 24, 2019

Quality assurance: - Subreddit moderation as first-pass filtering - Score threshold (≥1) to filter downvoted content - Manual sampling (10 posts per subreddit) to validate thematic consistency - Text cleaning: punctuation/numbers removed, revealing keywords stripped, lowercase conversion

Method

Data labeling: Distant supervision via subreddit theme assignment. Samples are not individually annotated; instead, each subreddit is labeled once (2-way, 3-way, 6-way) and all posts within inherit that label. Cohen's Kappa = 0.54 on manual 6-way labeling of 150 text-image pairs, indicating moderate agreement and overlap between categories (e.g., false connection / misleading content).

Feature extraction: - Text: BERT-Large (Uncased) mapped to 768-dim vectors; InferSent (4096-dim) with fastText embeddings as baseline. - Image: ResNet50, VGG16, EfficientNet-B4 (ImageNet pretrained, penultimate layer for feature extraction). - Fusion: Multimodal features condensed to n-element vectors via trainable dense layer, then merged via add, concatenate, maximum, or average operations.

Experimental setup: - Train/validation/test split: 878,218 / 92,444 / 92,444 - Hyperparameter tuning via Hyperband (hidden layer size 32–256, learning rate {1e-2, 1e-3, 1e-4}) - Optimizer: Adam; max 20 epochs, early stopping on validation accuracy

Results

2-way classification (fake/true): - Best (BERT+ResNet50, maximum fusion): 89.29% validation, 89.09% test - Multimodal >> text-only: 88.29% (BERT) vs 89.29%

3-way classification (true/misleading-true/false): - Best (BERT+ResNet50, maximum): 89.05% validation, 88.90% test - Consistent multimodal advantage

6-way classification (fine-grained): - Best (BERT+ResNet50, maximum): 86.00% validation, 85.88% test - Text-only BERT: 76.96% → multimodal improves by 9–10 percentage points - Image-only (ResNet50): 75.29% (poor—visual features alone insufficient)

Combination method comparison (BERT+ResNet50): - Maximum fusion: 85.88% (6-way test) - Concatenate: 82.49% - Add: 82.35% - Average: 82.42% Maximum substantially outperforms element-wise operations.

Error Analysis

Misclassification patterns on 6-way:

  • Hardest categories: Imposter content (bot-generated posts) and satire (models confused by surface-level similarity to true content when lacking cultural context).
  • Easiest category: Manipulated content (models correctly classify nearly all).
  • Bias to true: Misclassified samples frequently predicted as true; attributed to class imbalance (true samples comprise majority for 6-way split of fake category).

Example errors: - "nascar race stops to wait for family of ducks to pass" → predicted True, gold Satire (32.8% misclassification rate within satire class) - "bear experiences getting hit in the cinema rule, your child again" → predicted Satire, gold Imposter (55.7% misclassification rate within imposter class)

Connections

  • Compared against text-only datasets (LIAR, FEVER, FakeNewsCorpus): Fakeddit uniquely combines scale (>1M samples), multimodality (64% text+image), and fine-grained labels (6-way).
  • Related to EANN and MVAE via text+image fusion; Fakeddit focuses on dataset contribution rather than novel architecture.
  • Shares Reddit sourcing with Anatomy of Misinformation but extends scale and multimodality dramatically.
  • Precursor to later Multimodal fake news detection work leveraging joint text-image understanding for classification.

Notes

Strengths: - Unprecedented scale for multimodal fake news data at time of publication (2019); dataset represents significant resource contribution. - Distant supervision pragmatic: subreddit themes provide weak labels without expensive manual annotation at scale. - Comprehensive experimental design spanning text, image, and fusion modalities; clear demonstration of multimodal advantage. - Fine-grained 6-way labels enable nuanced detection research (satire vs. manipulated vs. false connection). - Long timespan (decade) captures linguistic/cultural drift; diverse subreddit sources prevent domain bias.

Limitations: - Distant supervision introduces noise: Cohen's Kappa = 0.54 (moderate agreement) suggests category boundaries are genuinely fuzzy; some samples may be mislabeled or multiply interpretable. - Class imbalance in 6-way setting: biases models toward true class. While true/fake nearly balanced at 2-way level, 6-way breakdown is skewed. - Satire and imposter content remain hard (6-way accuracy ~73% and ~52% respectively); baseline models struggle on these nuanced categories, indicating room for improvement in fine-grained classification. - Evaluation excludes metadata and comments provided in dataset; paper states "we do not utilize submission metadata and comments" but does not explore potential downstream impact. - Generalization to platforms beyond Reddit unclear; Reddit's voting/moderation dynamics may not reflect dynamics on Twitter, TikTok, or other platforms.

Follow-ups: - Investigate learned fusion mechanisms (attention, gating) vs. simple maximum fusion. - Explore few-shot or zero-shot approaches for satire/imposter, where evidence is subtle and contextual. - Leverage comment data and metadata for richer feature representations. - Extend to video + text + image for TikTok/YouTube-style misinformation.