Anatomy of an online misinformation network

Authors: Chengcheng Shao, Pik-Mai Hui, Lei Wang, Xinwen Jiang, Alessandro Flammini, Filippo Menczer, Giovanni Luca Ciampaglia
Venue: arXiv, 2018 — arxiv:1801.06122

TL;DR

A network analysis of misinformation and fact-checking spread during the 2016 US Presidential Election reveals stark segregation between the two communities on Twitter. The core of the network is dominated by misinformation spreaders; fact-checking nearly disappears in the densest regions. The authors show that penalizing high out-strength nodes (accounts spreading primarily low-credibility content) effectively reduces misinformation circulation.

Contributions

  • First in-depth analysis of the network structure of large-scale online misinformation diffusion, integrating both claims and fact-checks
  • Description and open release of Hoaxy, a platform for tracking misinformation and fact-checking competition on Twitter
  • Characterization of the core of the misinformation diffusion network using k-core decomposition, revealing different user roles across network layers
  • Network disruption analysis: identifying the most efficient strategies to reduce misinformation spread by targeting central nodes

Method

The authors built Hoaxy, a platform that monitors Twitter in real time, filters tweets for links to unverified claims or fact-checking articles, and stores the data in a searchable database. They performed k-core decomposition on the retweet network to identify nested "shells" of increasing connectivity, from the sparse periphery to the dense main core. For each node, they computed:

  • Fact-checking ratio (\(\rho_f\)): ratio of fact-checking edges to total edge strength
  • Retweet ratio (\(\rho_m\)): fraction of incoming edges from fact-checking vs. claim sources
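
As an illustration of the first metric, the following is a minimal sketch that computes \(\rho_f\) for each node as the share of its total edge strength carried by fact-checking edges. The function name `fact_checking_ratio`, the `(source, target, weight, kind)` edge layout, and the `"claim"` / `"fact_check"` labels are hypothetical, and counting both incoming and outgoing edges is an assumption; the paper's exact definition may differ.

```python
from collections import defaultdict


def fact_checking_ratio(edges):
    """Return {node: share of the node's edge strength on fact-checking edges}.

    `edges` is an iterable of (source, target, weight, kind) tuples, where
    `kind` is "claim" or "fact_check" (hypothetical labels for this sketch).
    Both endpoints of an edge are credited with its weight (an assumption).
    """
    total = defaultdict(float)
    fact = defaultdict(float)
    for src, dst, weight, kind in edges:
        for node in (src, dst):
            total[node] += weight
            if kind == "fact_check":
                fact[node] += weight
    return {node: fact[node] / total[node] for node in total if total[node] > 0}


# Toy retweet edges, invented purely for illustration.
edges = [
    ("a", "b", 3, "claim"),
    ("a", "c", 1, "fact_check"),
    ("c", "b", 2, "fact_check"),
]
rho_f = fact_checking_ratio(edges)
```

In this toy input, node `c` touches only fact-checking edges, so its ratio is 1.0, while node `a` is mostly tied to claim edges.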

They also applied Botometer to detect automated accounts and ranked nodes by multiple centrality measures (in-strength \(s_{in}\), out-strength \(s_{out}\), betweenness, PageRank). Network robustness was assessed by simulating node removal and measuring impact on retweet and claim link circulation.
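
The k-core decomposition step can be sketched with the standard peeling procedure: repeatedly remove the node of lowest remaining degree and record the largest such degree seen so far as its core number. This is a minimal pure-Python sketch, not the authors' implementation. It assumes the retweet network is treated as undirected and unweighted, and the helper names `build_adjacency` and `core_numbers` are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, dropping self-loops."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def core_numbers(adjacency):
    """Return {node: core number} by peeling minimum-degree nodes one at a time."""
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Pick the node with the smallest degree among those not yet removed.
        node = min(remaining, key=degree.__getitem__)
        # The core number never decreases as peeling moves inward.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core


# Toy graph: a triangle (a, b, c) with a two-node tail (d, e).
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d"), ("d", "e")]
core = core_numbers(build_adjacency(toy_edges))
main_core = {node for node, k in core.items() if k == max(core.values())}
```

Here the triangle forms the main core (core number 2) and the tail sits in the outer shell (core number 1). This simple version scans for the minimum at each step, so it is quadratic in the number of nodes; a bucket-based variant would be needed at the scale of a real retweet network.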

Results

Network segregation: The network exhibits strong community structure, with a clear separation between fact-checking and misinformation spreaders. As k-core shell number increases (moving toward the denser center), fact-checking content nearly disappears. Most users occupy one of four archetypal roles: primary claim spreader, secondary spreader, fact-checker, or partisan supporter.

Core dynamics: The main core reaches ~800 accounts around Election Day (Nov. 8, 2016) and stabilizes thereafter. The core is inhabited primarily by accounts sharing misinformation, with a heavy presence of bots and highly partisan behavior. Different centrality metrics rank different users at the top, but high-ranking accounts across metrics consistently show a strong partisan slant and little interest in accurate information.

Fact-checking within the core: Only 27.3% of sampled claims in the database are verified; fact-checking content is retweeted at a 1:17 ratio relative to misinformation. Some fact-checking is shared even within the main core, but often by accounts whose activity is primarily misinformation spreading (e.g., to mock or attack fact-checkers).

Network disruption: A greedy strategy that sequentially removes the accounts with the highest out-strength is the most effective at reducing claim retweets and unique claim links. Removing as few as 10 influential spreaders can reduce circulation significantly. Disconnecting the accounts with the highest in-strength is far less effective, suggesting the network is robust to the removal of consumers but vulnerable to the removal of amplifiers.

Notes

Strengths: The paper uses 100% of available tweets (not a sample), enabling complete network reconstruction. The platform is open and reproducible. The multi-pronged analysis (structure, dynamics, centrality measures, robustness) is thorough and well-motivated. The distinction between different user archetypes is insightful.

Limitations: Hoaxy tracks only a fixed list of claim sources and fact-checking organizations, so coverage is incomplete. The analysis focuses on English-language Twitter content from US-based organizations. The claim verification ground truth relies on curated lists, which may introduce bias toward prolific sources. The paper does not examine misinformation spread via mainstream media, which likely reaches more users than social networks alone.

Follow-up questions: How do these network patterns change across different topics? Do the proposed disruption strategies transfer to other misinformation domains (health, climate)? What role do algorithmic recommendations play in shaping the core?