Home

A curated reference for researchers working on fake news, misinformation, and disinformation — covering detection methods, propagation dynamics, and the underlying psychology and social science.

Maintained by the Syracuse University DataLab. Every claim on every page traces back to a primary source.

274 Papers · 5 Articles · 1,072 Authors · 732 Topics · 23 Datasets · 0 Tools · 2 Videos

Survey

Entry points for the field:

- Cao et al. (2023) — A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT in J. ACM — foundational survey of generative AI covering history (GANs to transformers), technical foundations, unimodal and multimodal models, applications across domains, and critical trustworthiness concerns (factuality, security, privacy, fairness); essential context for understanding AI-enabled misinformation generation and detection.
- Chang et al. (2023) — A Survey on Evaluation of Large Language Models in J. ACM — comprehensive survey on evaluation methodologies for LLMs across three dimensions (what, where, how to evaluate); encompasses 269 papers on natural language understanding, generation, reasoning, robustness, ethics, bias, factuality, trustworthiness, and domain-specific applications; essential for understanding how to assess LLMs used in misinformation detection and generation.
- Liu et al. (2023) — Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models' Alignment — comprehensive survey on LLM trustworthiness across seven dimensions (reliability, safety, fairness, resistance to misuse, explainability, social norms, robustness) with 29 sub-categories; presents a taxonomy, measurement studies on multiple LLMs, and case studies demonstrating that the effectiveness of alignment varies significantly across trustworthiness categories.
- Hamborg, Donnay & Gipp (2018) — Automated identification of media bias in news articles: an interdisciplinary literature review in International Journal on Digital Libraries — bridges social science and computer science research on media bias detection; defines nine bias forms (event/source selection, labeling, placement, spin); maps manual analysis methods to computational approaches.
- Zhou, Xu, Trajcevski & Zhang (2021) — A Survey of Information Cascade Analysis: Models, Predictions, and Recent Advances in ACM Computing Surveys — comprehensive survey of cascade prediction covering 250+ papers; taxonomy of feature-based, generative, and deep learning approaches.
- Zhou & Zafarani (2020) — A Survey of Fake News in ACM Computing Surveys — concise, authoritative survey of detection methods and opportunities.
- Shu et al. (2020) — Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements — comprehensive book chapter covering user engagement, weak supervision approaches, and trending issues.

Sources by type

See all papers, all articles (populated by ingest workflow)

Foundations

Kaddour et al. (2022) — Causal Machine Learning: A Survey and Open Problems — comprehensive 191-page survey of CausalML methods that formalize data generation as a structural causal model to enable reasoning about interventions and counterfactuals; taxonomizes 5 problem areas (causal supervised learning, generative modeling, explanations, fairness, reinforcement learning) with systematic comparison of methods and applications to computer vision, NLP, and graph learning; addresses robustness via invariant features, fairness via counterfactual constraints, and generalization across environments.
Brundage et al. (2018) — The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation — comprehensive threat analysis of AI-enabled attacks across digital, physical, and political security domains; identifies AI-specific threats including deepfakes, automated disinformation campaigns, and denial-of-information attacks; policy recommendations for mitigating malicious AI uses while enabling beneficial applications.
Bommasani et al. (2021) — On the Opportunities and Risks of Foundation Models — comprehensive 214-page Stanford CRFM report analyzing foundation models (large models trained on broad data and adapted to diverse tasks); examines capabilities across language, vision, and reasoning; applications in healthcare, law, education; and critical societal risks including misuse for misinformation/deepfakes, fairness harms, environmental costs, and security vulnerabilities; proposes frameworks for detection and mitigation.
Mikolov et al. (2013) — Efficient Estimation of Word Representations in Vector Space — introduces the Continuous Bag-of-Words (CBOW) and Skip-gram architectures for learning high-quality word embeddings from large corpora in under a day; demonstrates that word vectors capture both syntactic and semantic regularities enabling vector arithmetic (king − man + woman ≈ queen); foundational technique widely adopted in NLP pipelines including fake news detection systems.
Mikolov et al. (2013) — Distributed Representations of Words and Phrases and their Compositionality — extends Skip-gram with phrase representations, negative sampling, and subsampling of frequent words; demonstrates that word vectors exhibit compositional structure via simple vector addition; achieves 72% accuracy on phrase analogy tasks, enabling embedding-based representation of multi-word units and idioms.
Vaswani et al. (2017) — Attention Is All You Need — introduces the Transformer architecture based entirely on self-attention mechanisms, replacing recurrence and convolution; proposes scaled dot-product and multi-head attention; achieves state-of-the-art BLEU scores on machine translation (28.4 on WMT14 En-De, 41.0 on En-Fr) with significantly faster training; demonstrates strong generalization to parsing and other NLP tasks; canonical architecture underlying BERT, GPT, and virtually all modern fake news detection models.
Liu et al. (2019) — RoBERTa: A Robustly Optimized BERT Pretraining Approach — systematic replication study showing BERT was undertrained; proposes RoBERTa with four key improvements: dynamic masking, removal of next-sentence-prediction loss, training on longer sequences, and larger mini-batches; achieves state-of-the-art on GLUE, RACE, and SQuAD benchmarks; became foundational pretrained language model for downstream fine-tuning across NLP applications including fake-news and misinformation detection systems.
Izacard et al. (2022) — Atlas: Few-shot Learning with Retrieval Augmented Language Models — jointly-trained retrieval-augmented model achieving strong few-shot performance on knowledge-intensive tasks with compact parameters; achieves 42.4% on NaturalQuestions and 80.1% on FEVER fact-checking with full data; demonstrates improved parameter efficiency and interpretability compared to dense-only language models; foundational for resource-efficient fact-checking and misinformation detection systems.
Waseem (2016) — Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter — empirical comparison of expert (feminist and anti-racism activists) vs. crowdsourced annotations on 6,909 tweets; demonstrates that systems trained on expert labels substantially outperform those on crowdworker data (F1 91.19 vs. 83.88); introduces intersectional annotation scheme capturing both racism and sexism simultaneously; foundational methodological work on data quality and annotator expertise in hate speech detection.
Pulastya et al. (2021) — Assessing the Quality of the Datasets by Identifying Mislabeled Samples — proposes AQUAVS, a supervised variational autoencoder with auxiliary discriminative network, to automatically identify mislabeled data points via outlier detection in latent space; demonstrates high-precision mislabel identification without requiring clean validation data or prior knowledge of noise type; shows significant accuracy improvements on downstream classification tasks after filtering identified mislabeled samples, especially on MNIST and CIFAR-10.
Zannettou et al. (2018) — The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans — comprehensive typology of false information ecosystem identifying eight types of false information, twelve actor categories, and six motives; surveys 200+ papers on user perception, propagation dynamics, detection, containment, and political misinformation.
Anand, Chakraborty & Park (2016) — We used Neural Networks to Detect Clickbaits: You won't believe what happened Next! — bidirectional LSTM with distributed word embeddings and character-level CNN embeddings for clickbait detection; achieves 98% accuracy and 0.99 ROC-AUC on 15,000-headline dataset, 5% improvement over hand-crafted feature baselines; demonstrates effectiveness of deep learning without feature engineering for headline classification.
Lazer et al. (2018) — The Science of Fake News — multidisciplinary Science article synthesizing knowledge on fake news prevalence, impact, psychological mechanisms, and interventions (individual and platform-based); identifies major gaps and calls for industry-academic collaboration; canonical framework for the field.
Mohseni & Ragan (2018) — Combating Fake News with Interpretable News Feed Algorithms — position paper reviewing fake news detection methods and arguing that transparent, interpretable news feed algorithms could mitigate misinformation amplification by increasing user awareness of algorithmic curation; identifies echo chambers and filter bubbles as key mechanisms of harm.
Guess, Nagler & Tucker (2019) — Less than you think: Prevalence and predictors of fake news dissemination on Facebook — empirical study linking survey data (N=3,500) to Facebook profiles (N=1,191); establishes that fake news sharing during 2016 was rare (8.5% of users); identifies age as the strongest demographic predictor—users 65+ shared nearly 7 times as many fake news articles as those 18–29; effect persists after controlling for ideology and education.
Tandoc, Lim & Ling (2017) — Defining "Fake News": A typology of scholarly definitions — systematic review of 34 academic studies (2003–2017) that define and operationalize "fake news"; proposes a two-dimensional typology (facticity × intent to deceive) identifying six types: satire, parody, fabrication, manipulation, advertising, and propaganda; foundational for conceptual clarity in the field.
Lewandowsky et al. (2012) — Misinformation and its correction: Continued influence and successful debiasing — Psychological Science in the Public Interest foundational review synthesizing literature on why false beliefs persist despite corrections; examines cognitive mechanisms (mental models, source confusion, fluency, coherence) and proposes evidence-based debiasing strategies (warnings, alternative explanations, repeated corrections, worldview-consonant framing) with practical guidance for practitioners.
Zhou, Zafarani, Shu & Liu (2019) — Fake News: Fundamental Theories, Detection Strategies and Challenges — WSDM '19 tutorial survey synthesizing 20+ interdisciplinary theories (psychology, social science, economics, forensics) explaining why misinformation succeeds and why people spread it; unified framework of four detection perspectives (knowledge, style, propagation, credibility); identifies open challenges in timeliness, cross-domain transfer, and efficiency for real-world deployment.
Douglas, Sutton & Cichocka (2017) — The Psychology of Conspiracy Theories — Current Directions in Psychological Science review synthesizing two decades of empirical research; proposes unified taxonomy of epistemic (seeking understanding and certainty), existential (seeking control and security), and social (seeking positive in-group identity) motives driving conspiracy-theory belief; critical finding that conspiracy theories appear to frustrate rather than satisfy these underlying needs, making them a self-defeating form of motivated reasoning.
van Prooijen & Douglas (2018) — Belief in conspiracy theories: Basic principles of an emerging research domain — European Journal of Social Psychology special issue introduction synthesizing the emerging research domain around four foundational principles: conspiracy beliefs are consequential (with real impacts on health and relationships), universal (across cultures and historical periods), emotional (driven by sense-making rather than logic), and social (rooted in intergroup conflict); provides organizing framework for understanding why conspiracy theories appeal across diverse contexts.
Acerbi (2019) — Cognitive attraction and online misinformation — Palgrave Communications content analysis of 260 articles from 26 hoax websites showing that misinformation succeeds due to psychological appeal rather than media inefficiency; 86% of articles contain threat-related content (28%), negative framing (49%), social information (50%), or cognitive-preference elements; reframes misinformation as "high-quality" when measured by cognitive appeal, not truthfulness.
Treen, Williams & O'Neill (2020) — Online misinformation about climate change — comprehensive literature review synthesizing research across communication, psychology, computer science, and political science on climate change misinformation; defines concepts (misinformation vs. disinformation), identifies actor networks (scientists, governments, industry, media, think tanks), examines spread mechanisms via social media (homophily, echo chambers, algorithmic bias), analyzes impacts on policy and public attitudes, reviews countermeasures (inoculation, correction, detection, platform mechanisms).
Farrell (2016) — Corporate funding and ideological polarization about climate change — PNAS empirical study combining Structural Topic Modeling on 40,785 texts with organizational network analysis of 164 climate contrarian organizations (1993–2013); demonstrates that corporate funding from ExxonMobil and Koch foundations directly influences thematic content of polarization efforts, with funded organizations emphasizing energy-production-friendly and scientific-skepticism frames; provides empirical evidence for long-suspected dynamics of how private funding shapes public scientific discourse.
van der Linden et al. (2017) — Inoculating the Public against Misinformation about Climate Change — large-scale randomized experiments (N=2,167) testing attitudinal inoculation against competing consensus-related misinformation; consensus messaging increases perceived agreement 20 percentage points, but is nullified by competing claims; pre-emptive warnings and refutations preserve two-thirds of the effect across political spectrum, with no evidence of backfire.
Cook, Lewandowsky & Ecker (2017) — Neutralizing misinformation through inoculation: Exposing misleading argumentation techniques reduces their influence — PLOS ONE empirical study (N=714 and 392 in two experiments) testing inoculation theory on climate change misinformation; demonstrates that pre-exposure to explanations of flawed argumentation techniques (false balance, fake experts) neutralizes misinformation and reduces politically motivated polarization.
Pennycook et al. (2021) — Shifting attention to accuracy can reduce misinformation online — Nature paper demonstrating that subtle reminders to focus on accuracy increase sharing of accurate news across six survey experiments and a Twitter field experiment; identifies limited attention (not confusion or indifference) as the primary mechanism driving misinformation sharing; shows that reorienting people's limited cognitive attention toward accuracy can substantially improve the quality of shared information online.
Guess et al. (2020) — A digital media literacy intervention increases discernment between mainstream and false news in the United States and India — large-scale RCT testing Facebook's "Tips to Spot False News" platform-based intervention across three samples; finds 26.5% improvement in discernment in nationally representative US sample, 17.3% in India online, and no effect in rural face-to-face sample; effects persist ~3 weeks in US but decay over time; demonstrates that simple, scalable media literacy teaching can improve news evaluation ability but effectiveness varies by digital experience and context.
Roozenbeek & van der Linden (2019) — Fake news game confers psychological resistance against online misinformation — Palgrave Communications empirical study (N=15,000) demonstrating that active inoculation through a gamified ~15-minute intervention teaching six deception techniques (impersonation, polarisation, emotional manipulation, conspiracy theories, discrediting, trolling) significantly reduces perceived reliability of fake news across education, age, and political ideology; largest effect among those most vulnerable to misinformation.
Ecker et al. (2022) — The psychological drivers of misinformation belief and its resistance to correction — Nature Reviews Psychology comprehensive review of cognitive, social, and affective mechanisms in false belief formation; barriers to belief revision (continued influence effect); evidence-based interventions (prebunking and debunking); implications for journalists, policymakers, information consumers, and health communicators.
Wardle & Derakhshan (2017) — Information Disorder: Toward an Interdisciplinary Framework for Research and Policy Making — canonical framework distinguishing mis-, dis-, and mal-information by falseness and intent-to-harm; agent-message-interpreter model; creation-production-distribution lifecycle analysis; 34 policy recommendations for platforms, governments, media, and civil society.
Jack (2017) — Lexicon of Lies: Terms for Problematic Information — Data & Society practitioner's guide to terminology; clarifies distinctions between misinformation, disinformation, propaganda, information operations, gaslighting, and related concepts; examines challenges in establishing intent and cross-cultural complications.
Marwick & Lewis (2017) — Media Manipulation and Disinformation Online — Data & Society ecosystem-level analysis of internet subcultures' tactics, actors, and platform vulnerabilities; case studies of Gamergate, Pizzagate, and 2016 election manipulation.
Papasavva et al. (2021) — The Gospel According to Q: Understanding the QAnon Conspiracy from the Perspective of Canonical Information — empirical study of the QAnon conspiracy theory analyzing 4,961 unique Q drops from six aggregation sites, 121,956 Reddit posts, and related 4chan/8kun content; demonstrates poor canonicalization of Q drops across aggregation sites, provides stylometric evidence of multiple authors, and traces QAnon's transition from fringe imageboards to mainstream social networks via Reddit's crucial intermediary role.
Allcott & Gentzkow (2017) — Social Media and Fake News in the 2016 Election — first comprehensive empirical evidence on fake news exposure; database of 156 election-related false stories, web traffic data, and post-election survey of 1,208 adults; estimates average American saw 1.14 fake articles; documents a roughly 4:1 partisan asymmetry in shares (pro-Trump articles shared 30M times vs. pro-Clinton 7.6M); economic model of fake news supply/demand; argues electoral impact smaller than single TV ad.
Sahly, Shao & Kwon (2019) — Social Media for Political Campaigns: An Examination of Trump's and Clinton's Frame Building and Its Effect on Audience Engagement — comparative content analysis of frame building in 2016 campaign across Twitter (3,805 Trump, 655 Clinton tweets) and Facebook (655 posts); finds Trump relied on conflict and negative emotion frames while Clinton used morality and positive frames; frame effects on engagement (retweets, shares) consistent on Twitter but platform-specific on Facebook
Nelson & Taneja (2018) — The small, disloyal fake news audience — empirical audience measurement using comScore data showing fake news reaches only 675K unique monthly visitors vs. 28M for real news; applies the Law of Double Jeopardy to show fake news audiences are small and disloyal; demonstrates that audience availability (time spent online) is a stronger predictor of misinformation exposure than demographics; shows 80% of fake news traffic originates from social platforms, particularly Facebook.
Allen et al. (2020) — Evaluating the fake news problem at the scale of the information ecosystem — multimode national dataset (Nielsen TV, Comscore desktop/mobile) spanning 2016–2018; fake news comprises 0.15% of daily media diet; TV dominates news 5:1 over online; reframes misinformation debate toward mainstream media bias and news avoidance rather than overt fakery.
Helmus et al. (2018) — How to Counter Russian Social Media Influence in Eastern Europe — RAND Corporation report analyzing Russian state-sponsored social media campaigns; documents coordinated troll networks, bot accounts, fake hashtags, and nonattributed comments targeting Eastern European publics; mixed-methods approach combining quantitative social media analysis with expert interviews; identifies counter-strategies including accelerated detection, alternative narratives, and institutional capacity building.
Friggeri et al. (2014) — Rumor Cascades — large-scale empirical study of 16,672 rumor cascades on Facebook using Snopes.com ground truth; shows rumor cascades run deeper than typical content; finds true rumors more viral than false despite false rumors dominating uploads (62% vs. 45% on Snopes); Snopes fact-checks increase deletion likelihood 4.4× for false rumors but have minimal long-term propagation effects; demonstrates rumor mutation and variant selection over time.

Network and graph algorithms

Mao et al. (2024) — Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques — comprehensive survey of integrating LLMs with graph representation learning; proposes novel taxonomy decomposing models into primary components (knowledge extractors for attributes, structures, and labels; knowledge organizers as GNN-centric, LLM-centric, or hybrid) and operation techniques (integration strategies at input, hidden, and alignment-based levels; training strategies via pre-training, prompting, and instruction tuning); essential framework for understanding emerging graph foundation models that combine graph structure with semantic information.
Vatter, Mayer & Jacobsen (2023) — The Evolution of Distributed Systems for Graph Neural Networks and their Origin in Graph Processing and Deep Learning: A Survey — comprehensive survey of distributed systems for scalable GNN training, bridging graph processing systems (Pregel, PowerGraph, GraphLab) and DNN training frameworks; systematically categorizes partitioning strategies, sampling techniques, inter-process communication, synchronization modes, and programming abstractions across 20+ systems (DGL, GraphSAINT, DistDGL, etc.); essential for understanding how to scale GNN-based misinformation detection to large social networks.
Dai et al. (2023) — A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability in ACM Computing Surveys — comprehensive survey of trustworthy GNN research covering privacy attacks (membership inference, property inference, reconstruction) and defenses (differential privacy, federated learning, machine unlearning), adversarial robustness methods, fairness approaches to prevent discrimination, and explainability techniques; essential for understanding GNN-based approaches to misinformation detection and their real-world deployment challenges.
Fortunato, S. (2009) — Community detection in graphs — comprehensive 103-page survey covering algorithm taxonomy (hierarchical clustering, spectral methods, modularity optimization), theoretical foundations (NP-hardness, quality functions), benchmarking on standard networks, and applications across biological, social, and technological networks; provides foundational theory and methods for network-based approaches to misinformation detection and information diffusion analysis.

Rumour verification and stance

Zubiaga et al. (2016) — Stance Classification in Rumours as a Sequential Task Exploiting the Tree Structure of Social Media Conversations — introduces Linear and Tree Conditional Random Fields (CRF) to exploit sequential and tree-structured patterns in rumour conversations; demonstrates that modelling conversational tree structure significantly improves stance classification compared to non-sequential baselines on PHEME dataset
Kochkina, Liakata & Augenstein (2017) — Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM — best-performing system in RumourEval 2017 Subtask A; proposes Branch-LSTM architecture that decomposes conversation trees into linear branches and models them sequentially; achieves 78.4% accuracy using LSTM layers processing tweet sequences with word2vec and hand-crafted lexical/relational features.
Kumar & Carley (2019) — Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations — proposes Tree LSTM and Binarized Constituency Tree (BCTree) LSTM with convolution units for joint stance and veracity classification; uses multi-task learning showing complementary task benefits; achieves 0.520 mean F1 for stance and 0.379 for veracity on PHEME dataset, outperforming prior work by 12% and 15% respectively.
Zubiaga et al. (2018) — Detection and Resolution of Rumours in Social Media: A Survey — authoritative survey of end-to-end rumour classification pipeline: definition and typology (newly emerging vs. long-standing), data collection APIs and annotation schemes, detection approaches for both rumour types, tracking and filtering, stance classification (SDQC: Support/Deny/Query/Comment), and veracity classification; reviews datasets (PHEME, RumourEval), approaches from feature engineering to deep learning, and key challenges including class imbalance, vocabulary drift, and cross-platform generalization.
Kochkina, Liakata & Zubiaga (2018) — All-in-one: Multi-task Learning for Rumour Verification — demonstrates multi-task learning improvements for veracity classification by jointly training with stance and detection as auxiliary tasks; shows 13.6% improvement over single-task baselines on RumourEval and up to 28.9% on PHEME; analyzes link between dataset properties (kurtosis, entropy) and MTL effectiveness.
Hardalov et al. (2021) — A Survey on Stance Detection for Mis- and Disinformation Identification — comprehensive survey of stance detection (determining whether a text supports, denies, questions, or comments on a claim) reviewing task formulations, datasets across 15+ benchmarks and multiple languages, and approaches from feature engineering to pre-trained language models; covers applications to fact-checking, rumour verification, and propaganda detection.
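
The branch decomposition described in the Branch-LSTM entry above, splitting a conversation tree into linear root-to-leaf branches, can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' code: the function name `tree_to_branches`, the dictionary-of-children tree encoding, and the toy post ids are all assumptions made for the example.

```python
from typing import Dict, List


def tree_to_branches(children: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Return every root-to-leaf path of a reply tree as a list of post ids.

    `children` maps a post id to the ids of its direct replies
    (a hypothetical encoding chosen for this sketch).
    """
    branches: List[List[str]] = []
    stack = [(root, [root])]  # (current post, path from the source post)
    while stack:
        node, path = stack.pop()
        replies = children.get(node, [])
        if not replies:
            # A post with no replies ends a branch.
            branches.append(path)
            continue
        # Push in reverse so branches come out in left-to-right reply order.
        for reply in reversed(replies):
            stack.append((reply, path + [reply]))
    return branches


# Toy thread: a source tweet with two replies, one of which has two replies.
thread = {"src": ["r1", "r2"], "r1": ["r3", "r4"], "r2": []}
branches = tree_to_branches(thread, "src")
for branch in branches:
    print(" -> ".join(branch))
```

Each branch can then be passed to a sequence model as one ordered list of posts. Posts near the root appear in several branches, which is a property of this kind of decomposition.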

Stance detection

Shared tasks and benchmarks

The Clickbait Challenge 2017: Towards a Regression Model for Clickbait Strength — Clickbait Challenge 2017 shared task with 38,517 graded-scale annotated tweets; 13 submitted systems achieving significant performance gains over prior baselines; reformulates clickbait detection as regression to measure strength rather than binary classification; introduces Webis Clickbait Corpus 2017
A Benchmark Study of Machine Learning Models for Online Fake News Detection — comprehensive empirical benchmark comparing 19 machine learning models (8 traditional, 6 deep learning, 5 pre-trained transformers) on three fake news datasets spanning politics, health, and diverse topics; finds that BERT-based pre-trained models (RoBERTa reaches 96% accuracy on large datasets and >90% with 500 samples) substantially outperform traditional ML and deep learning approaches; offers practical guidance for practitioners across resource constraints.
RumourEval 2019: Determining Rumour Veracity and Support for Rumours — extended shared task with Twitter and Reddit data; two subtasks: (A) SDQC stance classification on 8,574 conversation posts, (B) veracity prediction on 446 rumours; 22 system submissions (70% increase from 2017); best systems employ pre-trained contextual embeddings (BERT, GPT); demonstrates that conversation structure and ensemble approaches advance rumour verification beyond single-task specialization.
SemEval-2017 Task 8: RumourEval — benchmark shared task establishing foundation for rumour verification research; defines two subtasks: (a) SDQC stance classification (Support/Deny/Query/Comment) of replies to rumourous claims, and (b) veracity prediction (true/false) of source tweets; provides datasets from 10 events with 297 training threads, 28 test threads, and 1,080 tweets; 13 systems from 4 continents participated; results show stance classification achievable (78% best), but veracity prediction remains hard (below baselines).
Thorne et al. (2018) — The Fact Extraction and VERification (FEVER) Shared Task — first shared task combining evidence retrieval and natural language inference for fact verification; 23 teams, 185,445 human-generated claims verified against Wikipedia; best system achieves 64.21% FEVER score; analysis reveals three-stage pipeline architecture (document selection → sentence selection → NLI) is dominant; post-competition evidence augmentation identified 308 new evidence sets and corrected label errors.
Da San Martino, Barrón-Cedeño & Nakov (2019) — Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection — shared task on propaganda technique identification in news articles; two subtasks: FLC (fragment-level with 18-way technique classification) and SLC (sentence-level binary); 90 registered teams with 39 submitting predictions; winning systems use fine-tuned BERT achieving 0.63 F1 (SLC) and 0.25 F1 (FLC); corpus of 497 annotated articles with fragment-level annotations enables interpretable propaganda analysis.

Datasets and resources

Zampieri et al. (2019) — Predicting the Type and Target of Offensive Posts in Social Media — OLID: 14,100 English tweets annotated with hierarchical three-level schema (offensive detection, type categorization: targeted/untargeted, target identification: individual/group/other); foundational dataset for offensive language detection, hate speech, and cyberbullying; baseline CNN achieves 0.80 macro-F1 on offensive detection.
Hanselowski et al. (2019) — A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking — SNOPES corpus with 6,422 validated claims and 14,296 documents from heterogeneous web sources; comprehensive annotations for document retrieval, evidence extraction (fine-grained), stance detection, and claim validation; demonstrates that multi-domain, multi-source fact-checking is more challenging than Wikipedia-only datasets
Horne et al. (2018) — Sampling the News Producers: A Large News and Feature Data Set for the Study of the Complex Media Landscape (NELA2017) — 1,586 articles from 92 diverse news sources (mainstream, hyper-partisan, satire, misinformation); 130 content-based linguistic and engagement features enabling source characterization; foundation for the NELA-GT series.
Dolhansky et al. (2020) — The DeepFake Detection Challenge (DFDC) Dataset — largest deepfake detection benchmark with 128,154 videos from 3,426 consenting actors and >100,000 face-swapped clips; addresses ethical limitations of prior datasets; includes diverse face-swap generation methods (DFAE, MM/NN, NTH, FSGAN, StyleGAN); public Kaggle competition demonstrates detection remains unsolved despite scale and diversity.
Gruppi, Horne & Adalı (2022) — NELA-GT-2022: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles — 1.78M news articles from 361 outlets spanning all of 2022 with source-level veracity labels from Media Bias/Fact Check (factuality scores 0–5, conspiracy/pseudoscience classification) and 346K embedded tweets; fifth release in NELA-GT series providing 5.5+ years of longitudinal coverage; benchmark for robust fake news detection and event-driven media analysis.
r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection — introduces Fakeddit, a large-scale multimodal dataset with 1.06M Reddit submissions, 64% text+image pairs, and 2-way/3-way/6-way labels; demonstrates that multimodal models (BERT + ResNet50 with maximum fusion) achieve 85.88% 6-way accuracy, ~10 percentage points above text-only baselines; identifies satire and imposter content as hardest categories.
Wang (2017) — Liar, Liar Pants on Fire: A New Benchmark Dataset for Fake News Detection — 12,836 labeled political statements from PolitiFact spanning 2007–2016 with 6-way fine-grained labels (pants-fire, false, barely-true, half-true, mostly-true, true) and rich metadata (speaker identity, party, state, job, credit history); detailed fact-check justifications and supporting documents; foundational benchmark for statement-level fact-checking research.
Augenstein et al. (2019) — MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims — largest publicly available real-world fact-checking dataset with 34,918 naturally occurring claims from 26 fact-checking websites; includes entity linking to Wikipedia (25,763 unique entities) and evidence pages retrieved via Google Search; multi-task learning approach for veracity prediction across domains with heterogeneous label schemas achieves 49.2% Macro F1
Thorne et al. (2018) — FEVER: A Large-Scale Dataset for Fact Extraction and VERification — dataset of 185,445 human-verified claims with Wikipedia evidence; three-class labels (SUPPORTED/REFUTED/NOT ENOUGH INFO) with sentence-level evidence annotation; baseline system combines document retrieval, sentence selection, and textual entailment achieving 31.87% accuracy with correct evidence requirement
Shu et al. (2018) — FakeNewsNet: A Data Repository with News Content, Social Context and Spatiotemporal Information for Studying Fake News on Social Media — multi-dimensional benchmark dataset containing PolitiFact (12,911 articles) and GossipCop (22,140 articles) with news content (text, images), social context (user engagement, network structure, user profiles), and spatiotemporal information (location, timestamps); integrates multiple feature families enabling research into detection, evolution, mitigation, and bot detection.
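Several entries above report macro-averaged F1 (for example the OLID and MultiFC baselines). As a reading aid, here is a minimal sketch of how that metric is computed: per-class F1, then an unweighted mean across classes. The function name `macro_f1` is illustrative and is not taken from any of the cited papers or their code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because every class contributes equally, a rare class (such as the least frequent labels in a 6-way scheme) affects macro-F1 as much as a frequent one. This is why it is often reported on imbalanced datasets.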

Media profiling and source credibility¶

Baly et al. (2018) — Predicting Factuality of Reporting and Bias of News Media Sources — predicts news outlet-level factuality (low/mixed/high) and political bias (7-point scale) using 141 features from article text, Wikipedia, Twitter, URL structure, and web traffic; introduces 1,066-website dataset manually annotated for both tasks; demonstrates textual features most predictive for factuality (58.02 Macro-F₁), while Wikipedia and Twitter features important for bias; feature ablation shows complementary value of diverse signals
Baly et al. (2019) — Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media — extends media profiling to jointly model trustworthiness (3-point scale) and political ideology (7-point scale) via Copula Ordinal Regression; multi-task learning with auxiliary tasks reduces mean absolute error; shows correlation between extreme bias and low factuality
Baly et al. (2020) — What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context — combines article text features with YouTube audience demographics and Wikipedia content for outlet-level bias and factuality prediction; shows multimodal approach improves performance over text-only baselines
Nakov et al. (2021) — A Survey on Predicting the Factuality and the Bias of News Media — comprehensive survey of media-level profiling methods; reviews textual, multimedia, audience, and infrastructure features; discusses challenges in ordinal bias scales, multimodality, and temporal variation

Deception and behavioral detection¶

A Deep Learning Approach for Multimodal Deception Detection — Multimodal neural networks for deception detection using 3D-CNN on video, audio features, textual CNN, and micro-expressions; achieves 96.14% accuracy on courtroom trial videos, substantially outperforming traditional classifiers

Mainstream media dissemination¶

Mihailidis & Viotty (2017) — Spreadable Spectacle in Digital Culture: Civic Expression, Fake News, and the Role of Media Literacies in "Post-Fact" Society — critical analysis of how mainstream media coverage paradoxically legitimizes spectacle while attempting to debunk it; uses Pizzagate conspiracy theory as case study showing how initial Reddit-based spectacle spreads through online communities, then becomes amplified by mainstream reporting; argues media literacy must be repositioned toward critique, creation, and civic engagement rather than individual fact-checking.
Tsfati et al. (2020) — Causes and consequences of mainstream media dissemination of fake news: literature review and synthesis — comprehensive literature review examining the understudied role of mainstream outlets (not fake news websites) in disseminating misinformation; identifies four structural reasons why mainstream media cover fake news (journalistic duty to expose falsehood, news values criteria, psychology of newsroom decision-making, technical infrastructure for monitoring); synthesizes psychological mechanisms explaining why mainstream coverage of false claims often backfires (fluency effects, negation accessibility, mental model persistence, motivated reasoning); argues mainstream media paradoxically function as both important correctors and amplifiers of misinformation.

COVID-19 pandemic infodemic¶

Memon & Carley (2020) — Characterizing COVID-19 Misinformation Communities Using a Novel Twitter Dataset — Twitter dataset (CMU-MisCOV19) with 4,573 manually annotated tweets across 17 categories; compares misinformed vs. informed communities on network structure (density), bot coordination (19% vs. 11%), linguistic patterns (narratives), and vaccination stance; finds misinformed communities denser and more organized, with evidence of organized disinformation campaigns.
van der Linden, Roozenbeek & Compton (2020) — Inoculating Against Fake News About COVID-19 — perspective on applying psychological inoculation to the COVID-19 infodemic; documents scope of pandemic misinformation (46–48% exposure in UK/US; 25%+ of top YouTube videos misleading); links conspiracy belief to vaccine hesitancy, reduced compliance, and violent intentions; proposes prebunking via gamified interventions (Bad News, Go Viral!) as scalable alternative to reactive fact-checking.
Cui & Lee (2020) — CoAID: COVID-19 Healthcare Misinformation Dataset — benchmark dataset with 4,251 news items (3,769 news articles, of which 204 fake and 3,565 true, plus 482 claims), 926 social posts, and 296,000 user engagements (tweets/replies); multiplatform (websites, Twitter, Facebook, Instagram, YouTube, TikTok); benchmarks detection methods from SVM to state-of-the-art attention-based models.
Fighting an Infodemic: COVID-19 Fake News Dataset — dataset of 10,700 COVID-19 posts/articles with binary real/fake labels collected from social media and fact-checking websites; benchmarks ML baselines achieving 93.32% F1-score with SVM.
Cinelli et al. (2020) — The COVID-19 Social Media Infodemic — large-scale comparative analysis of COVID-19 information diffusion across five platforms (Twitter, Instagram, YouTube, Reddit, Gab) with 1.3M+ posts from 3.7M+ users; applies epidemic models (EXP and SIR) to estimate platform-specific basic reproduction numbers (R₀); all platforms show infodemic conditions (R₀ > 1); characterizes platform-specific amplification of misinformation with coefficients ranging from 10% (YouTube) to 400% (Gab).
Pennycook et al. (2020) — Fighting COVID-19 Misinformation on Social Media: Experimental Evidence for a Scalable Accuracy-Nudge Intervention — pre-registered experiments (1,700+ participants) demonstrating accuracy-salience nudge intervention triples discernment in shared content decisions; effects uniform across partisanship, education, and geographic proximity to COVID-19.
Lee et al. (2020) — Misinformation Has High Perplexity — proposes language model perplexity as a falseness signal for evidence-based debunking; achieves 75% accuracy on scientific COVID-19 claims using GPT-2 with minimal labeled data; releases Covid19-scientific and Covid19-politifact test sets.
Roozenbeek et al. (2020) — Susceptibility to misinformation about COVID-19 around the world — international survey of 3,750 adults across five countries identifying psychological predictors of COVID-19 misinformation belief; susceptibility predicts vaccine hesitancy and reduced health-guidance compliance.
Zhou et al. (2020) — ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research — multimodal dataset of 2,029 COVID-19 news articles with credibility labels; benchmarks SAFE achieving best F₁ 0.833/0.672.
Yang et al. (2020) — CHECKED: Chinese COVID-19 Fake News Dataset — first Chinese-language COVID-19 misinformation dataset with 2,104 Weibo microblogs, expert labels, and propagation data covering 1.87M reposts.
Li et al. (2020) — MM-COVID: A Multilingual and Multimodal Data Repository for Combating COVID-19 Disinformation — multilingual multimodal dataset spanning six languages (English, Spanish, Portuguese, Hindi, French, Italian) with verified credibility labels.
Du et al. (2021) — Cross-lingual COVID-19 Fake News Detection — addresses misinformation in low-resource languages (Chinese) by training on English COVID-19 news and applying via machine translation; CrossFake framework using BERT with sub-text slicing achieves 75% accuracy on manually-curated Chinese dataset (86 fake, 114 real), outperforming monolingual and cross-lingual baselines.
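The perplexity signal summarized in the Lee et al. (2020) entry above can be illustrated with the standard definition: the exponential of the negative mean token log-probability. This is a minimal sketch of that computation only. It assumes per-token natural-log probabilities have already been obtained from some language model. The paper's GPT-2 scoring and evidence conditioning are not reproduced, and `flag_high_perplexity` and its threshold are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """exp(-mean log-probability) over the tokens of one claim."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def flag_high_perplexity(scored_claims, threshold):
    """Return claims whose perplexity exceeds a caller-chosen threshold.

    scored_claims: iterable of (claim_text, token_logprobs) pairs.
    threshold: hypothetical cut-off; it would need tuning on labeled data.
    """
    return [claim for claim, logprobs in scored_claims
            if perplexity(logprobs) > threshold]
```

A claim whose tokens each receive probability 0.5 has perplexity 2. Lower-probability tokens push the value up, which is the sense in which surprising claims score high.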

Synthetic media and deepfakes¶

Mirsky & Lee (2020) — The Creation and Detection of Deepfakes: A Survey — comprehensive 38-page survey covering both creation and detection methodologies; systematically reviews generative architectures (GANs, VAEs, CNNs, RNNs), technical approaches to reenactment/replacement/editing/synthesis, artifact-specific and undirected detection methods; identifies arms race dynamics and current technological limitations.
Tolosana et al. (2020) — DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection — comprehensive survey of facial manipulation techniques (entire face synthesis, identity swap, attribute manipulation, expression swap) and detection methods; covers GAN-based generation (StyleGAN, ProGAN), public databases, and state-of-the-art benchmarks showing detection difficulty under cross-domain conditions.
Rana et al. (2022) — Deepfake Detection: A Systematic Literature Review — comprehensive SLR of 112 deepfake detection papers (2018–2020) with rigorous methodology; organizes 77% deep learning (primarily CNNs), 18% machine learning, 3% statistical, and 2% blockchain-based techniques; synthesizes datasets (FaceForensics++, DFDC, DeeperForensics); evaluates 100+ detection models and 10+ feature types; finds deep learning achieves 89.7% mean accuracy vs. 85% for traditional ML; identifies standardization gaps in evaluation and future research directions.
Ba et al. (2024) — Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection — information-theoretic framework decomposing facial features into disentangled local representations and aggregated global representations using mutual information losses; achieves 0.983 AUC on FaceForensics++, 0.999 AUC on Celeb-DF-V2, 0.939 AUC on DFDC; demonstrates strong cross-dataset generalization (0.818–0.864 AUC on Celeb-DF when trained on FaceForensics++); addresses overfitting limitations of prior region-specific detection methods.
Ciftci, Demir & Yin (2019) — FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals — detects deepfakes via photoplethysmography (blood flow patterns) by analyzing spatial coherence and temporal consistency of biological signals; achieves 91%+ accuracy on FaceForensics++, Celeb-DF, and UADFV; introduces "in the wild" Deep Fakes Dataset; demonstrates biological signal inconsistencies as orthogonal detection signal to visual artifacts.
Sabir et al. (2019) — Recurrent Convolutional Strategies for Face Manipulation Detection in Videos — recurrent-convolutional networks exploiting temporal discrepancies for detecting Deepfake, Face2Face, and FaceSwap; combines face alignment preprocessing with bidirectional GRU cells operating on frame sequences; achieves 96.9%, 94.35%, and 96.3% accuracy respectively on FaceForensics++, improving prior state-of-the-art by up to 4.55%; shows bidirectional temporal recurrence essential while multi-level recurrence hurts due to limited training data.
DeepFakes: a New Threat to Face Recognition? Assessment and Detection — first publicly available GAN-based Deepfake database (620 videos from 16 VidTIMIT subject pairs); demonstrates that VGG and FaceNet face recognition systems achieve FAR of 85.62% and 95.00% on high-quality deepfakes; evaluates detection methods showing audio-visual lip-sync approaches fail entirely while image quality metrics achieve 8.97% EER
Rössler et al. (2019) — FaceForensics++: Learning to Detect Manipulated Facial Images — largest facial forgery benchmark (1.8M+ images from 1K+ videos) with four manipulation methods (Face2Face, FaceSwap, DeepFakes, NeuralTextures); comprehensive evaluation of detection methods from steganalysis to CNN-based approaches; human baseline (68.7% accuracy) vs. XceptionNet (99.26%); systematic analysis of compression robustness showing significant performance degradation under realistic post-processing.
Li, Chang & Lyu (2018) — In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking — detects deepfakes via physiological signal absence (eye blinking); LRCN model combining CNN feature extraction with LSTM temporal modeling achieves 0.99 AUC vs. 0.98 for CNN-only and 0.79 for hand-crafted baselines; exploits fact that deepfake training datasets rarely contain closed-eye images, making natural blink sequences absent from synthesized video.
Zhou et al. (2018) — Two-Stream Neural Networks for Tampered Face Detection — two-stream architecture combining GoogLeNet (high-level visual artifacts) with steganalysis-based triplet network (low-level noise residuals) for detecting face swapping; achieves 0.927 AUC on SwapMe/FaceSwap dataset of 2,010 high-quality tampered images; demonstrates robustness to post-processing (resizing, blurring, blending).
McCloskey & Albright (2018) — Detecting GAN-generated Imagery using Color Cues — forensic detection of GAN-generated images by analyzing generator network architecture; identifies two cues (color channel overlap and saturation suppression) that distinguish GANs from real cameras; saturation-based SVM achieves 0.7 AUC on fully GAN-generated images and 0.61 on face-swapped images.
Vaccari & Chadwick (2020) — Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News — experimental study (N=2,005 UK respondents) on political deepfakes using the widely-circulated Obama/Peele deepfake; finds that deepfakes increase uncertainty about content, and this uncertainty mediates reduced trust in news on social media; deepfakes threaten civic culture through epistemic erosion rather than mass deception; educational interventions showing deepfakes are synthetic can mitigate effects.
Fagni et al. (2020) — TweepFake: about detecting deepfake tweets — first public dataset of human vs. machine-generated tweets; 25,572 tweets from 23 bot accounts (GPT-2, RNN, LSTM, Markov, etc.) and 17 human accounts; benchmarks 13 detection methods; finds transformer-based fine-tuned models (RoBERTa) achieve 90% accuracy, character-level encodings effective for short text, but GPT-2 tweets remain challenging (65–80% accuracy).
Yang, Li & Lyu (2018) — Exposing Deep Fakes Using Inconsistent Head Poses — forensic detection method exploiting facial landmark misalignment in deepfake generation pipeline; compares 3D head poses estimated from all facial landmarks vs. central region only; real faces show consistent poses (cosine distance <0.02) while deepfakes exhibit large divergence (0.02–0.08); SVM classifier achieves 89.0% AUROC on deepfake videos and 84.3% on diverse face-swap dataset; demonstrates that synthesis errors invisible to human eye are detectable through geometric constraints.
Afchar et al. (2018) — MesoNet: A Compact Facial Video Forgery Detection Network — lightweight CNN architectures (Meso-4 and MesoInception-4) for detecting Deepfake and Face2Face forgeries at mesoscopic level; achieves 98% detection accuracy for Deepfake and 95% for Face2Face under realistic compression; introduces first publicly available Deepfake dataset with 175 videos; demonstrates that efficient networks with ~28K parameters match or exceed complex architectures while remaining computationally practical.
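The head-pose cue in the Yang, Li & Lyu (2018) entry above compares two pose estimates by cosine distance. The sketch below shows only that comparison, under two assumptions: the two head-orientation vectors are already estimated, and cosine distance means one minus cosine similarity. The 0.02 default comes from the summary above. The paper's landmark-based pose estimation and SVM classifier are not reproduced, and `looks_inconsistent` is a hypothetical helper.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("pose vectors must be non-zero")
    return 1.0 - dot / (norm_u * norm_v)


def looks_inconsistent(pose_all_landmarks, pose_central_region, threshold=0.02):
    """Flag a face whose two pose estimates diverge by at least `threshold`.

    threshold=0.02 mirrors the real-face bound quoted in the entry above;
    using a fixed threshold instead of a trained classifier is a simplification.
    """
    return cosine_distance(pose_all_landmarks, pose_central_region) >= threshold
```

Identical orientation vectors give a distance of 0 and orthogonal ones give 1, so the quoted 0.02 to 0.08 range corresponds to small but measurable angular disagreement.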

Offensive AI & threat modeling¶

Mirsky et al. (2021) — The Threat of Offensive AI to Organizations — comprehensive survey of 33 offensive AI capabilities (OACs) adversaries use to attack organizations, categorized into automation, campaign resilience, credential theft, exploit development, information gathering, social engineering, and stealth. Through expert user study (N=22), ranks threats by profit/achievability/defeatability/harm, finding exploit development, social engineering, and information gathering pose the greatest risk. Develops threat model T = H × (M/D) enabling practitioners to prioritize defensive investments. Particularly relevant for understanding AI-enabled social engineering attacks (deepfakes, impersonation, phishing) and model extraction threats to misinformation detection systems.
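The threat model T = H × (M/D) quoted above lends itself to a small worked example. This is a sketch under assumptions: H is read as harm, M as attacker motivation, and D as defeatability, and the ratings are treated as given positive numbers. How the paper derives these quantities from the survey responses is not reproduced, and the function names are illustrative.

```python
def threat_score(harm, motivation, defeatability):
    """T = H * (M / D), with the symbol meanings assumed as described above."""
    if defeatability <= 0:
        raise ValueError("defeatability must be positive")
    return harm * (motivation / defeatability)


def rank_threats(ratings):
    """Order capability names from highest to lowest threat score.

    ratings: dict mapping a capability name to a (harm, motivation,
    defeatability) tuple; the values used below are made up for illustration.
    """
    return sorted(ratings, key=lambda name: threat_score(*ratings[name]),
                  reverse=True)
```

With made-up ratings such as `{"phishing": (4, 3, 2), "scanning": (1, 1, 1)}`, `rank_threats` places `"phishing"` first. The score rises with harm and motivation and falls as a threat becomes easier to defeat.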

AI safety & governance¶

Mehrabi et al. (2023) — FLIRT: Feedback Loop In-context Red Teaming — Automated red teaming framework using in-context learning to generate adversarial prompts targeting generative models. Red language model generates prompts without fine-tuning; outputs are evaluated for safety, and feedback refines future generations. Proposes multiple attack strategies (FIFO, LIFO, Scoring, Scoring-LIFO) balancing effectiveness vs. diversity; demonstrates 80%+ attack success on vanilla Stable Diffusion and 60%+ on safeguarded variants, substantially outperforming prior manual and weakly-automated approaches. Shows attacks transfer across text-to-image models and extends to text-to-text models (GPT-Neo).
Perez et al. (2022) — Discovering Language Model Behaviors with Model-Written Evaluations — proposes using language models to generate high-quality evaluations for testing diverse model behaviors. Generates 154 datasets, each testing a distinct behavior, across personality, goal-seeking, politics, and ethics. Discovers inverse scaling phenomena where larger models exhibit worse behavior on some safety-relevant tasks (stronger political views, greater desire to avoid shutdown, increased sycophancy). Shows RLHF training amplifies political bias and can create unintended instrumental subgoals. Demonstrates that smaller preference models effectively predict RLHF model behavior, enabling early detection of safety concerns before full deployment.
Perez et al. (2022) — Red Teaming Language Models with Language Models — demonstrates automated red teaming to systematically discover harmful behaviors in language models. Uses one LM to generate adversarial test cases probing another LM for offensive replies, data leakage, personal information generation, and distributional biases. Explores zero-shot, few-shot, supervised, and reinforcement learning methods; RL achieves 27–42% offensive reply elicitation rates. Uncovers 1,709 training data leakage instances and reveals that models discuss different demographic groups with significantly different offensiveness rates. Foundational work showing language models can complement human red teaming at scale.
Bang et al. (2024) — Measuring Political Bias in Large Language Models: What Is Said and How It Is Said — evaluates political bias in LLM-generated content via two-tiered framework separating political stance (extreme anchor comparison) from framing bias (content and style components). Tests 11 open-source LLMs (LLaMa-2, Yi, Vicuna, Falcon, etc.) on 14 politically divisive topics; finds models exhibit liberal bias on social issues (same-sex marriage, climate change, public education), US-centric focus despite global training claims, and nuanced issue-specific biases varying across models. Decomposition of bias into content bias (entity/topic selection) and style bias (lexical polarity) provides explainable measurement beyond left-right spectrum.
Goldstein et al. (2023) — Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations — threat assessment of how generative language models could expand the scale and sophistication of influence operations and propaganda campaigns. Uses ABC framework (Actors, Behaviors, Content) to analyze threats; proposes comprehensive mitigation taxonomy across four intervention points (model design & construction, model access, content dissemination, belief formation). Evaluates mitigations for technical feasibility, social feasibility, downside risks, and impact. Identifies critical unknowns about adoption rates, effectiveness, and norm-setting.
Shu et al. (2023) — On the Exploitability of Instruction Tuning — investigates vulnerabilities in instruction-tuned LLMs to data poisoning attacks via AutoPoison, an automated pipeline generating high-quality poisoned training data; demonstrates content injection and over-refusal attacks that scale to larger models while maintaining fluency; shows instruction tuning's low sample complexity is a double-edged sword enabling both capability learning and behavior hijacking.
Evans et al. (2021) — Truthful AI: Developing and Governing AI That Does Not Lie — policy and governance framework for preventing AI systems from generating false or misleading statements. Proposes conceptual distinctions between lies, negligent falsehoods, and truthfulness; describes institutional arrangements for AI truthfulness standards (industry self-regulation, co-regulation, top-down regulation); outlines technical approaches to developing truthful AI. Argues early standards-setting is crucial before AI capabilities in strategic deception exceed human capacity.
Wei, Haghtalab & Steinhardt (2023) — Jailbroken: How Does LLM Safety Training Fail? — analyzes fundamental failure modes in safety-trained language models through two mechanisms: competing objectives (where safety training conflicts with pretraining-induced instruction following) and mismatched generalization (where safety training fails to cover capabilities developed during pretraining). Develops 30 jailbreak methods and tests on GPT-4, Claude v1.3, and GPT-3.5 Turbo, finding vulnerabilities persist despite extensive red-teaming. Argues that scaling and additional red-teaming alone cannot resolve fundamental tensions in how LLMs are trained for safety.

Platform governance and regulation¶

Gorwa (2019) — The platform governance triangle: conceptualising the informal regulation of online content — applies Abbott and Snidal's governance triangle framework to analyze how informal multi-stakeholder arrangements regulate online platform content across Europe. Maps regulatory initiatives by actor composition (state, firm, NGO) including NetzDG, AVMSD, Code of Conduct on Terror and Hate Content, Code of Practice on Disinformation, Facebook Oversight Board, and Global Network Initiative. Identifies three key dynamics shaping effectiveness: legitimation politics (contested authority), actor competencies (divergent expertise and capacity), and power relations (asymmetries favoring firms); argues effective governance requires collaboration across all three actor types, though such arrangements face ongoing tensions over legitimacy and influence.

Propagation-based detection¶

Shu, Bernard & Liu (2018) — Studying Fake News via Network Analysis: Detection and Mitigation — comprehensive chapter surveying network properties (echo chambers, filter bubbles, malicious accounts), three homogeneous and three heterogeneous network types, feature learning via network embeddings (NMF, RNNs), detection methods (interaction embedding, temporal diffusion, credibility propagation, knowledge network matching), and mitigation strategies (provenance identification, leader selection, influence minimization, mitigating campaigns).
Cheng et al. (2014) — Can Cascades be Predicted? — foundational work showing cascades on social networks are highly predictable (~80% accuracy); temporal features most predictive, followed by structural, resharer, and user features; demonstrates prediction improves with observation window size; reveals fundamental differences between user-initiated and page-initiated cascades.
Li et al. (2016) — DeepCas: an End-to-end Predictor of Information Cascades — first end-to-end deep learning approach to cascade size prediction; represents cascade graphs as random walk paths processed through bidirectional GRU with attention mechanisms; automatically learns cascade representations without hand-crafted features; evaluated on Twitter and academic citation cascades.
Wang et al. (2017) — Topological Recurrent Neural Network for Diffusion Prediction — novel LSTM architecture for dynamic DAGs; models cascades as diffusion topologies showing information spread over network structure; learns topology-aware sender embeddings capturing both node properties and cascade dynamics; achieves 20–56% relative improvement over DeepCas across three real-world networks.
Tacchini et al. (2017) — Some Like it Hoax: Automated Fake News Detection in Social Networks — hoax detection via user interaction patterns on Facebook; proposes that user "likes" encode veracity signals independent of content; two algorithms (logistic regression, harmonic boolean crowdsourcing) achieve >99% and 99.4% accuracy respectively; demonstrates transfer learning across Facebook communities and robustness with minimal labeled training data (<1% of posts); suggests diffusion patterns are a primary detection signal.
Ruchansky, Seo & Liu (2017) — CSI: A Hybrid Deep Model for Fake News Detection — three-module neural network combining text, temporal engagement response patterns, and user group behavior to identify fake news and suspicious users; Capture module uses LSTM on temporal features; Score module learns user suspiciousness from co-engagement patterns; achieves 89.2% accuracy on Twitter, outperforming text-only and propagation-only baselines; demonstrates value of joint modeling with fewer parameters than competing RNN approaches.
Castillo, Mendoza & Poblete (2011) — Information Credibility on Twitter — foundational work framing Twitter credibility assessment via user reputation and propagation signals; 2,500+ trending topics, human-labeled for newsworthiness and credibility; shows user features (registration age, followers, activity) and propagation structure (retweet tree depth/breadth) are stronger predictors than text alone; achieves 89% accuracy on newsworthy detection, 86% on credibility classification.
Ma, Gao & Wong (2017) — Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning — kernel-based approach (PTK/cPTK) measuring structural similarity between propagation trees; soft-matches subtrees to capture high-order patterns; context-sensitive extension considers propagation paths from root; extends to four-class classification (false/true/unverified/non-rumor); achieves 75.0% accuracy on Twitter15 with superior early detection (75% within 24 hours).
Ma, Gao & Wong (2018) — Rumor Detection on Twitter with Tree-structured Recursive Neural Networks — applies recursive neural networks (bottom-up and top-down variants) to thread propagation trees; learns joint content-structure representations where local patterns (supportive/questioning replies) signal veracity; achieves 72.3% / 73.7% accuracy on Twitter15/16 with superior early detection (8 hours vs. 36 hours to match baseline).
Vosoughi, Roy & Aral (2018) — The Spread of True and False News Online — largest longitudinal study of misinformation diffusion; ~126K verified true/false cascades from Twitter (2006–2017), 3M users, 4.5M shares; falsehood reaches 1,500 people about 6× faster than truth and reaches cascade depth 19 nearly 10× faster than truth reaches depth 10; false news is 70% more likely to be retweeted; humans (not bots) are responsible for the difference; perceived novelty is a key driver.
Shu et al. (2019) — Hierarchical Propagation Networks for Fake News Detection — constructs hierarchical propagation networks from both macro-level (retweet cascades) and micro-level (reply conversations) granularity; extracts structural, temporal, and linguistic features showing fake news spreads deeper, contains more bots, has shorter lifespan, and generates more negative sentiment; hierarchical network features (HPNF) achieve F1 > 0.80, outperforming prior macro-level-only approaches.
Zhou & Zafarani (2019) — Network-based Fake News Detection: A Pattern-driven Approach — four network-structural patterns (More-Spreader, Farther-Distance, Stronger-Engagement, Denser-Network) across five network levels; 138 interpretable features; RF 0.929/0.932 accuracy/F₁ on PolitiFact without reading content; robust to limited early-stage network information.
Monti et al. (2019) — Fake News Detection on Social Media using Geometric Deep Learning — graph convolutional networks on propagation cascades integrating user profiles, activity, social network structure, and spreading patterns; 92.7% ROC AUC on Twitter; user profile and network features are most important (90% AUC combined) while content marginally contributes; enables early detection within 1–2 hours with minimal cascade size (6 tweets).
Dou et al. (2021) — User Preference-aware Fake News Detection (UPFD) — combines endogenous user preferences (from historical tweets) with exogenous news propagation patterns via GNNs. Encodes user preferences and news content using BERT/word2vec, builds Twitter retweet cascade graphs, uses GNN message passing to integrate signals; concatenates user engagement and news textual embeddings for classification. Achieves 84.62% accuracy on Politifact and 97.23% on Gossipcop; ablation studies show both user preference and propagation are necessary; demonstrates confirmation bias is a learnable signal in engagement data.
Lu & Li (2020) — GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media — detects fake news on Twitter using only source tweet text and retweet user sequences (no comments, no explicit network topology); models user propagation with CNN and GRU, constructs fully-connected user interaction graphs weighted by feature similarity, applies GCN and dual co-attention mechanism to jointly highlight suspicious users and informative words; achieves 87.67% accuracy on Twitter15 and 90.84% on Twitter16 (18–20% improvement over prior work); demonstrates early detection at 90% accuracy with only 10 retweets; provides interpretable explanations of suspicious user characteristics and dramatic linguistic markers.
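Many entries in this section rely on structural cascade features such as size, depth, and breadth (for example the retweet-tree depth and breadth in Castillo et al. and the cascade depths in Vosoughi et al.). The sketch below shows one common way to compute them from a child-to-parent mapping. The input format and function names are assumptions for illustration and are not drawn from any cited paper's code.

```python
from collections import Counter


def node_depths(parent_of):
    """Depth of every node in a cascade tree; the root (parent None) has depth 0.

    parent_of: dict mapping each post or user id to the id it reshared from.
    Assumes a well-formed tree (every parent is a key, no cycles).
    """
    depths = {}
    for start in parent_of:
        path, node = [], start
        # Walk upward until we hit a node with a known depth or the root.
        while node not in depths:
            parent = parent_of[node]
            if parent is None:
                depths[node] = 0
                break
            path.append(node)
            node = parent
        depth = depths[node]
        # Unwind the path, assigning depths from the known ancestor downward.
        for visited in reversed(path):
            depth += 1
            depths[visited] = depth
    return depths


def cascade_shape(parent_of):
    """Size, maximum depth, and maximum breadth (largest level) of a cascade."""
    depths = node_depths(parent_of)
    per_level = Counter(depths.values())
    return {
        "size": len(depths),
        "depth": max(depths.values()),
        "max_breadth": max(per_level.values()),
    }
```

For a root with two direct reshares, one of which is reshared three more times, `cascade_shape` reports size 6, depth 2, and maximum breadth 3. Feature-based detectors typically feed such values into a classifier alongside temporal and user features.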

Bot detection¶

Cresci et al. (2016) — DNA-Inspired Online Behavioral Modeling and Its Application to Spambot Detection — encodes user actions as character sequences and applies longest common substring analysis to detect groups of similar accounts; outperforms supervised and unsupervised baselines achieving MCC 0.952 on political bots; demonstrates paradigm shift toward group-level detection of evolved spambots
Cresci et al. (2017) — The Paradigm-Shift of Social Spambots: Evidence, Theories, and Tools for the Arms Race — empirical evidence of a new generation of social spambots that evade all existing detection approaches (Twitter, humans, academic tools); crowdsourced evaluation shows 0.24 accuracy vs 0.91 on traditional bots; paradigm shift toward group-level detection methods
Cresci et al. (2019) — Better Safe Than Sorry: an Adversarial Approach to improve Social Bot Detection — proposes GenBot, a genetic algorithm for synthesizing evolved spambots; demonstrates that evolved bots evade state-of-the-art detection (F₁ ≈ 0.26) while revealing an actionable vulnerability (entropy-based signatures); introduces proactive detection paradigm to anticipate and preemptively defend against future bot evolutions
Cresci (2020) — A Decade of Social Bot Detection — decade-long longitudinal review (2010–2020) of social bot detection research; systematically catalogs 230+ detectors across two dimensions (individual vs. group detection, methodological approach); documents three "waves" of bot evolution and shift from individual-account to group-level detection approaches; analyzes publication trends showing exponential growth post-2014; argues modern ML detectors fail because they assume a stationary, neutral environment, which adversarial bot evolution violates
Sayyadiharikandeh et al. (2020) — Detection of Novel Social Bots by Ensembles of Specialized Classifiers — ensemble of specialized classifiers; shows heterogeneous bot behaviors (spammers, fake followers, political bots) are distinguished by different feature sets; ESC trains per-type classifiers combined via maximum rule; achieves 56% improvement in F1 score for novel bots (47% → 73%); improves cross-domain recall from 42% to 84%; enables efficient adaptation with fewer labeled examples for retraining; deployed in Botometer v4 achieving AUC 0.99
Davis et al. (2016) — BotOrNot: A System to Evaluate Social Bots — publicly available web and API service for classifying Twitter accounts as human or bot using 1,000+ features; Random Forest classifier achieves 0.95 AUC on 15k bots and 16k legitimate accounts; served over one million API requests since 2014 launch
Yang et al. (2019) — Arming the public with artificial intelligence to counter social bots — comprehensive review of social bot types, activities, and impact; case study of Botometer bot detection tool; user experience survey (N=731) revealing interpretation challenges; proposes calibration methods (Platt scaling, Complete Automation Probability) to make bot scores interpretable; demonstrates model generalization across diverse bot types
Yang et al. (2020) — Scalable and Generalizable Social Bot Detection through Data Selection — scalable metadata-only bot detection framework using only 20 user features; achieves 900M tweets/day processing speed; compiles 13 labeled datasets (94K bots, 43K humans); demonstrates that strategic data selection (training on a curated subset) yields better generalization and consistency than exhaustive training; achieves 0.99 AUC on unseen datasets
Shao et al. (2018) — Anatomy of an online misinformation network — network analysis of fact-checking vs. misinformation diffusion during 2016 U.S. election; k-core decomposition reveals strong segregation between claim and fact-check communities; fact-checking nearly disappears in dense network cores dominated by bots and misinformation spreaders; identifies efficient node-removal strategies for disrupting misinformation circulation
Shao et al. (2017) — The spread of low-credibility content by social bots — large-scale empirical analysis (14M messages, 400K articles) during 2016 U.S. election; shows only 6% of accounts spreading misinformation are bots but they account for 31% of tweet volume; bots employ early amplification strategy and target influential users; network dismantling analysis shows removing bots is critical for reducing misinformation spread
Varol et al. (2017) — Online Human-Bot Interactions: Detection, Estimation, and Characterization — large-scale machine-learning framework extracting 1,150 behavioral features to classify Twitter accounts as bots or humans; evaluates on 14M accounts; estimates 9–15% of active users are bots; clustering analysis reveals behavioral phenotypes; demonstrates concept drift in detection systems
Ferrara et al. (2015) — The Rise of Social Bots — foundational survey of social bot phenomenon and detection methods; proposes taxonomy dividing approaches into graph-based detection (network structure), crowd-sourced detection (human judgment), and feature-based detection (behavioral patterns); analyzes characteristics distinguishing bots from humans and discusses arms race between sophistication and detection.
Ayoobi, Shahriar & Mukherjee (2023) — The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention — Dataset of 3,600 LinkedIn profiles (1,800 legitimate, 600 human-created fake, 1,200 ChatGPT-generated) and Section and Subsection Tag Embedding (SSTE) method for detection; achieves 95% accuracy distinguishing legitimate from fake profiles, and 70%+ accuracy on unseen ChatGPT-generated profiles despite absence from training; demonstrates that minimal LLM-generated training samples suffice for generalization to diverse LLM outputs.
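
As an illustration of the group-level idea in the Cresci et al. (2016) entry above, the following is a minimal sketch of encoding account actions as character strings and comparing them by longest common substring. The action alphabet, the function names, and the pairwise comparison are assumptions made for this example; the paper works with groups of accounts rather than pairs, so this should be read as a simplified toy, not the authors' implementation.

```python
from itertools import combinations

# Hypothetical action alphabet, chosen for illustration only.
ACTION_CODES = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode_actions(actions):
    """Map a chronological list of action names to a behavioral string."""
    return "".join(ACTION_CODES[a] for a in actions)


def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def suspicious_pairs(sequences, min_len):
    """Flag account pairs whose behavioral strings share a long common run.

    Pairwise comparison is a simplification adopted for this sketch.
    """
    flagged = []
    for (u, su), (v, sv) in combinations(sorted(sequences.items()), 2):
        if len(longest_common_substring(su, sv)) >= min_len:
            flagged.append((u, v))
    return flagged
```

The intuition is that independently acting humans rarely share long identical action runs, whereas automated accounts driven by the same software often do; the `min_len` threshold here is arbitrary and would need to be chosen empirically.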

Media manipulation & coordinated campaigns

Hosseinmardi et al. (2022) — Examining the consumption of radical content on YouTube — large-scale empirical audit (N=309.8K Americans, 2016–2019) of YouTube's role in radicalization; finds far-right content is small (0.05% of users) and stable; anti-woke content grew steadily; 55% of far-right views arrive via external URLs/search/homepage rather than recommendations; no within-session escalation toward extremity; on/off-platform consumption patterns align (2–3× co-consumption), suggesting user preference over algorithmic steering; concludes YouTube is one library in a larger ecosystem.
Munger & Phillips (2022) — Right-Wing YouTube: A Supply and Demand Perspective — longitudinal platform analysis (2008–2018+) of 35 far-right YouTube channels (26K+ videos) vs. 219 mainstream media channels; identifies three creator clusters (Conservative, Alt-Lite, Alt-Right) with distinct ideological positioning and audience dynamics; demonstrates viewership peaked in 2017 and declined afterward, driven by supply-side affordances (low creation barriers, monetization, algorithmic recommendation) and demand-side audience appetite rather than algorithm amplification alone; reframes radicalization debate as supply-and-demand problem fundamental to platform affordance design.
Ribeiro, Veselovsky & West (2023) — The Amplification Paradox in Recommender Systems — agent-based model resolving the contradiction between algorithmic audits (showing recommendation amplification of extreme content) and real user logs (showing recommendations don't drive extreme consumption); demonstrates that collaborative filtering plus content nicheness alone explain both observations; argues audits must model user utility preferences to meaningfully assess radicalization and filter bubbles; challenges regulatory frameworks treating all algorithmic prominence as "amplification."
Stella, Ferrara & De Domenico (2018) — Bots increase exposure to negative and inflammatory content in online social systems — analysis of ~3.6M tweets during 2017 Catalan independence referendum; ~33% of active users are bots; shows bots from network periphery strategically target human influencers (Kendall τ=0.62, p<10⁻⁴); amplify group-specific inflammatory content (violent narratives for Independentists vs. neutral for Constitutionalists); demonstrate bot-driven social contagion reinforces and exacerbates human polarization.
Bail et al. (2020) — Assessing the Russian Internet Research Agency's impact on the political attitudes and behaviors of American Twitter users in late 2017 — rare causal assessment of IRA's political impact; longitudinal study (N=1,239 partisan Twitter users) finding no significant effect of IRA interaction on political attitudes (affective/ideological) or behaviors (engagement, network composition) over one month; identifies users with strong echo chambers and high political interest as most likely to interact with IRA, suggesting Russians may have failed to polarize because they targeted already-entrenched partisans.
Stukal et al. (2017) — Detecting Bots on Russian Political Twitter — large-scale bot detection on Russian political Twitter (2014–2015); ensemble supervised classifier achieves 95% precision and 87% recall; finds >50% of politically-active accounts are bots; software platform (Twitterfeed, ifttt.com) strongest bot predictor; bot activity spikes during Crimean annexation and opposition politician assassination (Nemtsov), suggesting coordinated propaganda use.
Golovchenko et al. (2020) — Cross-Platform State Propaganda: Russian Trolls on Twitter and YouTube during the 2016 U.S. Presidential Election — analysis of IRA propaganda strategy (1,052 accounts, 108,781 tweets); examines hyperlinks to news media and YouTube videos; finds IRA linked to conservative sources 66% of the time; identifies "pre-propaganda" strategy where liberal-leaning accounts build credibility before amplifying conservative content; demonstrates cross-platform propaganda exploitation.
Linvill & Warren (2020) — Troll Factories: Manufacturing Specialized Disinformation on Twitter — analysis of Russia's Internet Research Agency (IRA) operations on Twitter (2009–2018), identifying five specialized account categories (Right Troll, Left Troll, News Feed, Hashtag Gamer, Fearmonger) with distinct behavioral signatures; demonstrates IRA operated as coordinated "propaganda factory" with specialized units responding to political events (2016 election debates, Podesta email release); content analysis alone identifies account type in 83% of cases.
Giglietto et al. (2020) — It takes a village to manipulate the media: coordinated link sharing behavior during 2018 and 2019 Italian elections — detects coordinated inauthentic behavior via near-simultaneous link sharing on Facebook; empirical evidence that coordinated networks amplify problematic domains 1.79–2.22× more than uncoordinated entities; distinguishes political vs. non-political deceptive networks.
Lukito (2019) — Coordinating a Multi-Platform Disinformation Campaign: Internet Research Agency Activity on Three U.S. Social Media Platforms, 2015 to 2017 — empirical analysis of IRA's coordinated strategy across Facebook, Twitter, and Reddit (3,126 ads, 1.9M tweets, 12.6K posts); uses vector autoregression to demonstrate platform-aware temporal coordination; reveals Reddit as "trial balloon" for testing messages before Twitter amplification; shows IRA Twitter activity responds to Trump approval ratings.
Machado et al. (2019) — A Study of Misinformation in WhatsApp groups with a focus on the Brazilian Presidential Elections — empirical study of 130 public WhatsApp groups during 2018 Brazilian election; analyzes 45,072 URLs and 400 media files; develops grounded typology; finds 13.1% of links from junk news sources, 40% to YouTube; documents strategic use of messaging platforms for campaign propaganda outside public platform oversight.
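
The near-simultaneous link sharing signal described in the Giglietto et al. (2020) entry above can be sketched roughly as follows. The input format, the time window, and the minimum number of co-shared URLs are hypothetical parameters chosen for this example, not values taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def coordinated_pairs(shares, window_seconds, min_urls):
    """Find entity pairs that repeatedly share the same URL close in time.

    shares: iterable of (entity, url, unix_timestamp) tuples (assumed format).
    Returns a dict mapping (entity_a, entity_b) to the set of URLs they
    shared within window_seconds of each other, keeping only pairs with
    at least min_urls such URLs.
    """
    by_url = defaultdict(list)
    for entity, url, ts in shares:
        by_url[url].append((ts, entity))

    co_shared = defaultdict(set)
    for url, events in by_url.items():
        events.sort()  # chronological order, so t2 >= t1 below
        for (t1, e1), (t2, e2) in combinations(events, 2):
            if e1 != e2 and t2 - t1 <= window_seconds:
                co_shared[tuple(sorted((e1, e2)))].add(url)

    return {pair: urls for pair, urls in co_shared.items() if len(urls) >= min_urls}
```

Requiring several distinct co-shared URLs, rather than one, is a simple guard against flagging two pages that happened to post the same breaking story at the same moment.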

Fringe communities and internet culture

Zannettou et al. (2018) — On the Origins of Memes by Means of Fringe Web Communities — large-scale empirical study of meme origins and propagation across Twitter, Reddit, /pol/, and Gab (160M images, 2016–2017); uses perceptual hashing and custom distance metrics to identify 12.6K meme clusters; employs Hawkes processes to quantify directed influence between communities; finds /pol/ and The_Donald substantially influence mainstream meme ecosystems despite modest size; documents disproportionate prevalence of hateful and anti-semitic memes on fringe communities.
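
To make the hash-based clustering step in the Zannettou et al. (2018) entry more concrete, here is a minimal sketch that groups images whose hashes differ in only a few bits. It assumes the perceptual hashes have already been computed and are available as integers; the single-linkage grouping and the distance threshold are simplifications chosen for this example and are not the paper's clustering procedure or custom distance metric.

```python
def hamming(h1, h2):
    """Number of differing bits between two integer hashes."""
    return bin(h1 ^ h2).count("1")


def cluster_hashes(hashes, max_distance):
    """Group image ids whose hashes are within max_distance bits.

    hashes: dict mapping image id -> integer hash (assumed precomputed).
    Uses union-find, so clusters are connected components (single linkage).
    """
    parent = {k: k for k in hashes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    ids = sorted(hashes)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if hamming(hashes[a], hashes[b]) <= max_distance:
                parent[find(a)] = find(b)

    groups = {}
    for k in ids:
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())
```

The all-pairs loop is quadratic and only suitable for small collections; a corpus on the scale reported above would need an indexing structure instead.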

LLM-generated text detection

Tang, Chuang & Hu (2023) — The Science of Detecting LLM-Generated Texts — comprehensive survey of black-box and white-box detection approaches for LLM-generated text, covering data collection, feature selection (statistical disparities, linguistic patterns, fact verification), classification models, watermarking strategies, benchmark datasets (HC3, Neural Fake News, etc.), and challenges including adaptive attacks, bias in training data, and threats from open-source LLMs.

Style / content-based detection

Wang & Chang (2022) — Toxicity Detection with Generative Prompt-based Inference — zero-shot toxicity detection using generative prompt-based classification; compares generative (estimating p(x|y)) vs. discriminative formulations; careful prompt engineering crucial for performance; demonstrates generative approach outperforms discriminative and embedding-similarity baselines on SBIC, HateExplain, and Civility datasets; qualitative analysis reveals LLMs sometimes rely on spurious correlations learned during pre-training.
Oshikawa, Qian & Wang (2020) — A Survey on Natural Language Processing for Fake News Detection — comprehensive NLP survey systematically comparing task formulations (classification vs. regression), nine benchmark datasets (LIAR, FEVER, FakeNewsNet, SNS data), and five methodological approaches (preprocessing, ML models, rhetorical approaches, evidence collection); reports that attention-based LSTM models outperform hand-crafted linguistic features; surveyed models reach 41.5–45.7% on LIAR, 68–76% on FEVER, and 94.4% on FakeNewsNet (with graph convolutional networks).
Singhania, Fernandez & Rao (2023) — 3HAN: A Deep Neural Network for Fake News Detection — three-level hierarchical attention network modeling articles at word, sentence, and headline-body levels. Word-level attention extracts relevant words; sentence-level attention identifies informative sentences; headline-body attention captures stance between headline and body. Achieves 96.77% accuracy with headline-based pre-training; provides interpretable attention visualizations showing which words and sentences drive fake news predictions.
Liu, Wang, Li & Li (2024) — TELLER: A Trustworthy Framework For Explainable, Generalizable and Controllable Fake News Detection — dual-system framework combining LLM-driven cognition system (decomposes claims into interpretable yes/no questions) with neural-symbolic decision system (learns transparent logic rules via disjunctive normal form); achieves 76% accuracy on GossipCop and 80%+ on three datasets while maintaining explainability, generalizability, and human controllability; demonstrates integration of human expertise with machine learning for trustworthy detection.
Potthast et al. (2017) — A Stylometric Inquiry into Hyperpartisan and Fake News — stylometric analysis via writing style features showing hyperpartisan news can be distinguished from mainstream (F1=0.78), satire from both (F1=0.81), but style-based fake news detection alone insufficient (F1=0.46); introduces Unmasking technique for assessing style similarity between text categories; corpus of 1,627 fact-checked articles from BuzzFeed.
Karimi & Tang (2019) — Learning Hierarchical Discourse-level Structure for Fake News Detection — proposes HDSF framework that automatically learns discourse-level dependency trees (hierarchical sentence organizations) via inter-sentential attention; identifies three structure-related properties distinguishing fake/real news: leaf node count (coherence), preorder difference (sentence ordering), parent-child distance (discourse cohesion); achieves 82.19% accuracy, outperforming linguistic baselines; real news documents exhibit statistically significant higher coherence in discourse structures.
Rashkin et al. (2017) — Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking — linguistic analysis of news across satire, hoax, propaganda, and trusted categories; demonstrates that fake news uses more first-person pronouns, superlatives, modal adverbs, and hedging while trusted news uses concrete language and assertive verbs; news reliability classification 65% F1; fine-grained truthfulness prediction on PolitiFact 6-point scale achieves 22% F1 (6-class) and 52% F1 (2-class).
Zhou et al. (2023) — Linguistic-style-aware Neural Networks for Fake News Detection — HERO hierarchical recursive neural network; constructs per-document linguistic trees integrating constituency and RST discourse structure; Bi-GRU aggregation preserves global tree topology; attribute-specific variant 0.866/0.896 AUC on Recovery/MM-COVID, outperforming HAN, Text-GCN, DRNN, and Transformer baselines.
Cao et al. (2025) — Is Less Really More? Fake News Detection with Limited Information — SLIM framework; replaces full article text with MMR-selected keywords, POS/NER sequence tags, or metadata; 30% keyword extraction achieves ~99% accuracy ratio vs. full text; XLNet_base backbone; 95.55%/97.60% accuracy on ReCOVery/Fake_And_Real_News.
Kaliyar, Goswami & Narang (2021) — FakeBERT: Fake News Detection in Social Media with a BERT-based Deep Learning Approach — combines BERT embeddings with parallel 1D convolutional neural networks using varying kernel sizes for multi-scale feature extraction; achieves 98.90% accuracy on real-world 2016 U.S. Presidential Election dataset, substantially outperforming CNN (92.70%) and LSTM (97.55%) baselines, demonstrating the effectiveness of contextualized transformer embeddings for fake news detection on social media.
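
The MMR-based keyword selection mentioned in the Cao et al. (2025) entry can be illustrated with a generic maximal marginal relevance loop. This is a sketch of the general technique, not the SLIM implementation: the vector inputs, the `lam` trade-off parameter, and the function names are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(candidates, doc_vec, k, lam=0.5):
    """Greedily pick k keywords that are relevant to the document yet diverse.

    candidates: dict mapping keyword -> embedding vector (assumed given).
    lam: weight on relevance; (1 - lam) weights the redundancy penalty.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(word):
            relevance = cosine(remaining[word], doc_vec)
            redundancy = max(
                (cosine(remaining[word], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With `lam=1.0` the loop reduces to plain relevance ranking, which tends to return near-duplicate keywords; lowering `lam` trades some relevance for coverage.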

Knowledge & fact-checking

Zhang et al. (2023) — How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances — Comprehensive survey of methods to keep LLMs aligned with ever-changing world knowledge without retraining; systematically categorizes implicit approaches (knowledge editing, continual learning) and explicit approaches (memory-augmented, retrieval-augmented, internet-enhanced); compares scalability, efficiency, and parameter-modification trade-offs; identifies open challenges in robust knowledge updates and unified evaluation benchmarks
Quelle & Bovet (2023) — The Perils & Promises of Fact-checking with Large Language Models — Evaluates GPT-3.5 and GPT-4 for fact-checking on PolitiFact and 16+ languages in Data Commons; demonstrates evidence retrieval via Google Search improves accuracy by 10–20 percentage points; reveals critical language-specific performance disparities with non-English claims underperforming when compared to English translations, suggesting fundamental training-data bias in multilingual verification
Wang & Shu (2023) — Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models — proposes FOLK, using first-order logic to decompose claims and LLMs for knowledge-grounded reasoning; leverages external retrieval to ground answers, making veracity predictions with explanations; achieves state-of-the-art on HoVER (54.80% F1 three-hop), FEVEROUS (67.01% F1 multi-hop), SciFactOpen (67.59% F1); explains why knowledge-grounding is more reliable than relying on LLM internal knowledge
Lewis et al. (2020) — Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — proposes RAG combining dense retrieval with seq2seq generation; evaluated on open-domain QA, abstractive QA, Jeopardy generation, and fact verification; retrieval-augmented models generate more factual and specific text than parametric-only baselines
Asai et al. (2023) — Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection — extends RAG by teaching language models to adaptively decide when to retrieve, and to generate reflection tokens for self-critique; achieves superior performance on PubHealth, PopQA, ASQA, and biography generation; demonstrates improved factuality and citation accuracy.
Dun et al. (2021) — KAN: Knowledge-aware Attention Network for Fake News Detection — incorporates entities and their knowledge graph contexts (immediate neighbors in Wikidata) via entity linking; applies News-towards-Entities (N-E) and News-towards-Entities-and-Entity-Contexts (N-E2C) attention mechanisms to weight entity importance; achieves 7.4% F1 improvement on PolitiFact, 2.8% on GossipCop, 9.7% on PHEME over prior knowledge-graph methods.
Mayank, Sharma & Sharma (2021) — DEAP-FAKED: Knowledge Graph based Approach for Fake News Detection — combines biLSTM news encoding with Wikidata knowledge graph embeddings (ComplEx) to detect fake news from titles alone; achieves 88% and 78% F1 on two datasets through systematic bias removal and entity-based feature integration.
Ciampaglia et al. (2015) — Computational fact checking from knowledge networks — knowledge-graph-based approach framing fact checking as shortest-path computation in Wikipedia; semantic proximity metric based on entity generality enables high-accuracy verification; demonstrates that graph topology carries truth signal independent of text content.
Nieminen & Rapeli (2019) — Fighting Misperceptions and Doubting Journalists' Objectivity: A Review of Fact-checking Literature — comprehensive literature review of 48 studies on political fact-checking, examining effectiveness in reducing misperceptions (mixed results including backfire effects), fact-checking as a profession (methodological inconsistencies), and public opinion about fact-checking; documents that 88% of studies are US-focused.
Graves (2016) — Boundaries Not Drawn: Mapping the institutional roots of the global fact-checking movement — ethnographic mapping of the global fact-checking movement's organizational landscape across journalism, academia, and politics/civil society axes; identifies institutional diversity and contested professional boundaries across countries.
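
The path-based idea in the Ciampaglia et al. (2015) entry can be sketched as a shortest-path search in which passing through generic, highly connected entities is penalized. The scoring form used here, 1 / (1 + sum of log-degrees of intermediate nodes), is an assumption made for this illustration and should be checked against the paper; the graph format and function names are likewise hypothetical.

```python
import heapq
import math


def build_graph(edges):
    """Build an undirected adjacency dict from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def truth_score(graph, subject, obj):
    """Score a (subject, object) claim by its most specific connecting path.

    Entering an intermediate node costs log(degree); the target costs nothing.
    Returns 1.0 for identical or directly linked nodes, 0.0 if unreachable.
    """
    if subject == obj:
        return 1.0
    best = {subject: 0.0}
    heap = [(0.0, subject)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == obj:
            return 1.0 / (1.0 + cost)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt in graph.get(node, ()):
            step = 0.0 if nxt == obj else math.log(len(graph[nxt]))
            new_cost = cost + step
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return 0.0
```

Because log-degree costs are non-negative, Dijkstra's algorithm applies directly; a path through a hub such as a very general category node scores lower than an equally long path through a specific entity.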

Evidence-based detection

Popat et al. (2018) — DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning — end-to-end neural network combining claims and evidence articles via bidirectional LSTMs with claim-specific attention; automatically discovers which web articles support or refute a claim without hand-crafted features; 78.96% accuracy on Snopes, demonstrating the value of external evidence for credibility assessment.
Vo & Lee (2021) — Hierarchical Multi-head Attentive Network for Evidence-aware Fake News Detection — proposes MAC with hierarchical multi-head attention at both word and document levels; word-level attention identifies important phrases in claims and evidence; document-level attention weights evidence sources by relevance; jointly optimized with BiLSTM embeddings; achieves 88.7% AUC on Snopes (9.47% improvement over baselines) and 75.8% on PolitiFact; ablation studies demonstrate both attention levels are essential for evidence-aware fact-checking.
Jin et al. (2021) — Towards Fine-Grained Reasoning for Fake News Detection — constructs claim-evidence graphs from social media (posts, users, keywords) and uses mutual-reinforcement-based ranking to identify salient evidence; proposes bi-channel kernel graph attention network integrating textual and social signals for fine-grained reasoning; achieves 91.7% F1 on PolitiFact and 86.4% F1 on GossipCop with interpretable explanations of which evidence groups matter most for each prediction.

Multimodal

Jagtap et al. (2021) — Misinformation Detection on YouTube Using Video Captions — applies pre-trained word embeddings (GloVe, Word2Vec) to YouTube video captions for three-class (Misinformation, Debunking, Neutral) and binary classification; achieves 0.85–0.90 F₁ (three-class) and 0.92–0.95 F₁ (binary); demonstrates that video metadata (views, likes) alone insufficient but caption analysis with classical ML classifiers outperforms baselines across five conspiracy topics (vaccines, 9/11, chemtrails, moon landing, flat earth).
Alam et al. (2021) — A Survey on Multimodal Disinformation Detection — comprehensive survey of multimodal disinformation covering text, images, speech, video, network structure, and temporal information; distinguishes factuality (content falsity) from harmfulness (intent to deceive/harm); systematically reviews ~140 papers on detection approaches and identifies key challenges in combining multiple modalities.
Nakamura, Levy & Wang (2019) — r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection — large-scale multimodal dataset with 1.06M Reddit submissions (64% text+image) labeled for 2-way, 3-way, and 6-way classification; demonstrates multimodal models (BERT + ResNet50) achieve 85.88% 6-way accuracy, ~10 percentage points above text-only baselines; identifies satire and imposter content as hardest categories.
Wang et al. (2018) — EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection — adversarial learning to remove event-specific features and learn event-invariant representations; feature extractor cooperates with fake news detector and tries to fool event discriminator; 71.5% / 82.7% accuracy on Twitter / Weibo; first to formulate fake news detection on newly emerged events as a transfer learning problem.
Yang et al. (2018) — TI-CNN: Convolutional Neural Networks for Fake News Detection — parallel CNN branches for text and image; extracts explicit features (word counts, punctuation, capital letters, negations, pronouns, face count, resolution) and learns latent representations; concatenates both feature sets for classification; achieves F₁ 0.9210 on 2016 US presidential election news, significantly outperforming text-only (0.8920) and image-only (0.4729) approaches.
Khattar et al. (2019) — MVAE: Multimodal Variational Autoencoder for Fake News Detection — variational autoencoder learning shared text-image representations; jointly trains encoder-decoder (VAE reconstruction) with binary classifier; 74.5% / 82.4% accuracy on Twitter / Weibo, improving ~6% over attention-based baselines by explicitly modeling cross-modal correlations.
Li et al. (2020) — A Multi-Modal Method for Satire Detection using Textual and Visual Cues — multi-modal satire detection using ViLBERT (Vision & Language BERT) on headline-image pairs from satirical and mainstream news sources; achieves 93.80% accuracy on 10,000-article dataset; demonstrates that early fusion and multi-modal pre-training outperform uni-modal and simple fusion baselines; notably, image forensics (ELA+CNN) alone underperforms, highlighting importance of joint reasoning.
Zhou et al. (2020) — SAFE: Similarity-Aware Multi-Modal Fake News Detection — proposes cross-modal text-image similarity as a detection signal; modified cosine similarity between Text-CNN text and image2sentence visual representations; F₁ 0.896/0.895 on PolitiFact/GossipCop, outperforming text-only and prior multi-modal baselines.
Tan, Plummer & Saenko (2020) — Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News — first to address defending against machine-generated multimodal fake news with images and captions. Proposes DIDAN, a named entity-based approach to detect visual-semantic inconsistencies by measuring named entity co-occurrence between article text and image captions. Introduces NeuralNews dataset of 128K articles across four types (real/generated articles × real/generated captions). Shows naive humans achieve only 46.2% detection accuracy while trained humans with visual-semantic cues reach 67.8%; identifies Type C (generated text + real images) as most deceptive.
Silva et al. (2021) — Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multimodal Data — addresses practical problem that multimodal models trained on one domain fail on others (politics→entertainment, politics→COVID-19); unsupervised domain discovery via propagation network community detection; supervised domain-agnostic classifier preserves both domain-specific and cross-domain knowledge via dual decoders with adversarial loss; LSH-based instance selection reduces labeling cost; 7.55% F₁ improvement on rarely-appearing domains; 0.836–0.869 F₁ across PolitiFact/GossipCop/CoAID.
Zhou et al. (2020) — ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research — multimodal COVID-19 news credibility dataset; 2,029 articles from 60 screened publishers with NewsGuard/MBFC labels, 140,820 tweets; benchmarks LIWC, RST, Text-CNN, and SAFE with SAFE achieving best F₁ 0.833/0.672.
Yang et al. (2020) — CHECKED: Chinese COVID-19 Fake News Dataset — first Chinese-language COVID-19 misinformation dataset; 2,104 Weibo microblogs with per-item expert labels, multimedia, and 1.87M repost/1.19M comment propagation graphs; TextCNN baseline macro F₁ = 0.938.
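
The cross-modal similarity signal in the SAFE entry above (Zhou et al., 2020) can be illustrated with a rescaled cosine similarity between a text vector and an image vector. The rescaling to the range [0, 1], the neutral value for zero vectors, and the fixed threshold are assumptions made for this sketch; in the paper the similarity feeds a jointly trained model rather than a hard cutoff.

```python
import math


def modified_cosine(text_vec, image_vec):
    """Cosine similarity rescaled from [-1, 1] to [0, 1]."""
    dot = sum(t * v for t, v in zip(text_vec, image_vec))
    norms = math.sqrt(sum(t * t for t in text_vec)) * math.sqrt(
        sum(v * v for v in image_vec)
    )
    if norms == 0:
        return 0.5  # assumption: treat a zero vector as neutral
    return (dot + norms) / (2 * norms)


def mismatch_flag(text_vec, image_vec, threshold=0.5):
    """Flag an article whose text and image representations disagree.

    The threshold is hypothetical and only serves the illustration.
    """
    return modified_cosine(text_vec, image_vec) < threshold
```

A low score suggests the image does not depict what the text describes, which is the kind of mismatch the entry identifies as a detection signal.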

Baly et al. (2019) — Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media — applies Copula Ordinal Regression to jointly model outlet trustworthiness (3-point scale) and political ideology (7-point scale); auxiliary tasks at different bias granularities (binary, 3-point, 5-point) reduce prediction error; factuality MAE 0.481, bias MAE 1.479; demonstrates that extreme bias and low factuality are correlated.
Nakov et al. (2021) — A Survey on Predicting the Factuality and the Bias of News Media — comprehensive survey of media-level profiling for source reliability and political bias prediction; reviews textual features (linguistic markers, embeddings), multimedia analysis (image/video forensics), audience homophily (follower ideology), and infrastructure characteristics (domain, hosting); argues for joint factuality-bias modeling; documents that bias prediction is easier than factuality because factuality requires external ground-truth verification.
Shu, Wang & Liu (2019) — Beyond News Contents: The Role of Social Context for Fake News Detection — tri-relationship embedding framework jointly modeling publisher-news relations and user-news interactions; incorporates publisher partisan bias and user credibility; achieves 4–6% F1 improvement over content+profile baselines on FakeNewsNet; demonstrates early detection (>80% F1 within 48 hours).
Eady et al. (2019) — How Many People Live in Political Bubbles on Social Media? Evidence From Linked Survey and Twitter Data — links survey data (N=1,496) to Twitter behavioral data (642K accounts, 1.2B tweets); measures ideological exposure through following and retweet patterns; finds substantial cross-ideological overlap, contradicting "filter bubble" narratives; only 34% of most-conservative quintile live in extreme partisan bubbles; retweets and weak ties provide important bridges across ideological divides; documents asymmetries with conservatives more likely to follow left-leaning outlets.
Grinberg et al. (2019) — Fake news on Twitter during the 2016 U.S. presidential election — individual-level Twitter analysis linked to voter registration; extreme concentration of fake news consumption (1% of users see 80%) driven by older, conservative, politically engaged subset; identifies distinct media ecosystem cluster for fake news with high internal density; shows mainstream sources still dominant even for extreme-right users.
Mosleh et al. (2021) — Shared partisanship dramatically increases social tie formation in a Twitter field experiment — Twitter field experiment with bot accounts; users 3× more likely to reciprocally follow copartisans (β=0.093, p<0.001); no partisan asymmetry (Democrats and Republicans equally homophilous); demonstrates partisan preference as intrinsic driver of echo chamber formation, not purely algorithmic; N=842, preregistered.
Mosleh et al. (2021) — Cognitive reflection correlates with behavior on Twitter — field study linking Cognitive Reflection Test scores to actual Twitter behavior; high-CRT users are more discerning: follow fewer accounts, share higher-quality news sources, show cognitive echo chambers independent of partisanship; N=1,901 from Prolific with Twitter API data.
Shu et al. (2019) — The Role of User Profiles for Fake News Detection — characterizes Twitter user profiles (explicit + implicit) of fake vs. real news sharers; UPF vector outperforms RST and LIWC content baselines on FakeNewsNet.
Shu et al. (2019) — dEFEND: Explainable Fake News Detection — jointly encodes news sentences and user comments via hierarchical attention with sentence-comment co-attention; outperforms content-only (HPA-BLSTM) and comment-only baselines; demonstrates complementarity of news and social signals for both detection and explainability.
Sitaula et al. (2019) — Credibility-based Fake News Detection — frames detection as credibility assessment; 3 source features (author count + publication history) achieve F1-macro 0.77–0.83, outperforming 23 content features (0.68) on PolitiFact + BuzzFeed.

Susceptibility factors and public health behavior¶

Suarez-Lledo & Alvarez-Galvez (2021) — Prevalence of Health Misinformation on Social Media: Systematic Review — systematic review of 69 studies characterizing health misinformation topics and their prevalence across social media platforms. Vaccines (32% of studies), drugs/smoking (22%), noncommunicable diseases (19%), pandemics (10%), eating disorders (9%), and medical treatments (7%) emerge as principal categories; health misinformation most prevalent on Twitter; provides standardized measurement framework and identifies key analytical approaches for future research.
Jones-Jang, Mortensen & Liu (2021) — Does Media Literacy Help Identification of Fake News? Information Literacy Helps, but Other Literacies Don't — national survey (N=1,299) testing four literacy types (media, information, news, digital) as predictors of fake news identification ability. Only information literacy—the ability to navigate, search, and evaluate verified information sources—significantly predicts accurate identification; media, news, and digital literacies showed no significant relationship, challenging common assumptions about media literacy interventions.
Vraga & Tully (2021) — News literacy, social media behaviors, and skepticism toward information on social media — national survey (N=788) examining how news literacy, cognitive reflection, self-perceived media literacy, and value for media literacy relate to exposure and posting of political content on social media, and whether they predict skepticism toward information encountered in these spaces; finds actual knowledge (news literacy) predicts less engagement but greater skepticism, while self-assessed literacy shows opposite pattern; platform-specific effects with news literacy suppressing engagement on Facebook and YouTube but not Twitter.
Roozenbeek et al. (2020) — Susceptibility to misinformation about COVID-19 around the world — international survey of 3,750 adults across five countries (UK, Ireland, USA, Spain, Mexico) examining psychological and cognitive predictors of COVID-19 misinformation susceptibility. Higher trust in scientists and numeracy predict lower misinformation belief; susceptibility predicts vaccine hesitancy and reduced health-guidance compliance across all countries.

Moral and behavioral psychology of misinformation¶

Effron & Raj (2020) — Misinformation and Morality: Encountering Fake-News Headlines Makes Them Seem Less Unethical to Publish and Share — four preregistered experiments (N=2,587) demonstrating that repeated exposure to fake-news headlines reduces moral condemnation of spreading them independent of belief; fluency-driven moral desensitization operates via "familiar = acceptable" intuition; effect persists when headlines labeled false and mediates downstream sharing intentions on social media.
Brady et al. (2017) — Emotion shapes the diffusion of moralized content in social networks — 563,312 tweets across gun control, same-sex marriage, climate change; moral-emotional words increase retweet rate ~20% per word; effect stronger within ideological in-groups than out-groups, explaining "echo chamber" polarization.

Adversarial & defensive¶

Mazurczyk, Lee & Vlachos (2023) — Disinformation 2.0 in the Age of AI: A Cybersecurity Perspective — position paper proposing that disinformation should be treated as a cybersecurity threat, especially as AI enables large-scale creation and dissemination; introduces four attack scenarios (AI-generated content, perturbation, detection evasion, optimized spread) and proposes defense-in-depth countermeasures across social network, ISP, device, and user layers
Hameleers et al. (2020) — A Picture Paints a Thousand Lies? — experimental evidence (N=1,404) that multimodal disinformation (text+image Twitter posts) is perceived as more credible than text-only falsehoods, but fact-checkers effectively reduce credibility regardless of modality; motivated reasoning moderates effectiveness—fact-checkers most persuasive when reaching those already skeptical of the false claim.
Walter et al. (2020) — Fact-Checking: A Meta-Analysis of What Works and for Whom — meta-analysis of 30 studies quantifying fact-checking effectiveness (d = 0.29) and identifying key moderators including motivated reasoning, political ideology, message design, and context; pro-attitudinal corrections much more effective than counter-attitudinal ones; visual elements often backfire; campaign messaging harder to correct.
Krause et al. (2020) — Fact-checking as risk communication: the multi-layered risk of misinformation in times of COVID-19 — applies 40+ years of risk communication research to COVID-19 fact-checking; argues that fact-checking fails without trust, and that competing definitions of misinformation risk across political groups undermine fact-checker credibility; recommends trust-building partnerships, transparency on uncertainty, and value-aligned messaging.
Pennycook et al. (2020) — Fighting COVID-19 Misinformation on Social Media: Experimental Evidence for a Scalable Accuracy-Nudge Intervention — two pre-registered experiments (1,700+ participants) demonstrating that a simple accuracy-salience nudge (rating an unrelated headline for accuracy) nearly triples truth discernment when deciding what to share; intervention works uniformly across partisanship, education, and geographic proximity to COVID-19 epicenters.
Bode & Vraga (2015) — In Related News, That Was Wrong: The Correction of Misinformation Through Related Stories Functionality in Social Media — experimental test of Facebook's "related stories" feature for correcting health misinformation; shows algorithmic curation can reduce misperceptions when debunking matches user's initial position, with effects moderated by motivated reasoning

Early detection¶

Zhou et al. (2020) — Fake News Early Detection: An Interdisciplinary Study — theory-driven feature engineering at lexicon-, syntax-, semantic-, and discourse-level; ≈89% accuracy on PolitiFact/BuzzFeed with content-only features; outperforms propagation-based and hybrid baselines; reveals empirical links between fake news, deception, and clickbait.

Explainability¶

Danilevsky et al. (2020) — A Survey of the State of Explainable AI for Natural Language Processing in arXiv — comprehensive survey of explainability in NLP across 50 papers; categorizes explanations by local/global scope and self-explaining/post-hoc generation; details five core techniques (feature importance, surrogate models, example-driven, provenance-based, induction); reviews visualization methods (saliency, raw declarative, natural language); identifies evaluation gaps and future directions; foundational reference for interpretable NLP systems including fake-news detectors.
Shu et al. (2019) — dEFEND: Explainable Fake News Detection — hierarchical attention networks jointly encoding news content and user comments; sentence-comment co-attention identifies which sentences and comments drive the fake-news prediction; 0.904 accuracy on PolitiFact; human evaluation demonstrates dEFEND ranks check-worthy sentences better than HPA-BLSTM.

Cross-domain & transfer¶

Nan et al. (2021) — MDFEND: Multi-domain Fake News Detection — Introduces Weibo21, the first multi-domain fake news dataset from a single platform with 9 domains; proposes MDFEND using mixture-of-experts with domain gate to adaptively aggregate representations across domains; achieves 0.9137 F₁, outperforming single-domain and multi-domain baselines; directly addresses domain shift in linguistic patterns and propagation behavior.
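The domain-gated mixture-of-experts idea in the MDFEND entry can be illustrated with a minimal sketch. This is not the authors' implementation: the experts here are toy functions, and the gate is reduced to a fixed vector of logits per domain (`GATE_LOGITS`, a hypothetical stand-in for a learned gate). All names below are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Turn gate logits into mixture weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_mixture(
    features: Vector,
    domain: str,
    experts: Sequence[Callable[[Vector], Vector]],
    gate_logits: Dict[str, Sequence[float]],
) -> Vector:
    """Aggregate expert outputs with domain-dependent weights."""
    weights = softmax(gate_logits[domain])
    outputs = [expert(features) for expert in experts]
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]


# Toy experts (hypothetical): real experts would be learned text encoders.
def double(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def negate(v: Vector) -> Vector:
    return [-x for x in v]


EXPERTS = [double, negate]
# Hypothetical gate: "health" mixes both experts equally,
# "politics" relies almost entirely on the first expert.
GATE_LOGITS = {"health": [0.0, 0.0], "politics": [50.0, -50.0]}

print(gated_mixture([1.0, 2.0], "health", EXPERTS, GATE_LOGITS))
print(gated_mixture([1.0, 2.0], "politics", EXPERTS, GATE_LOGITS))
```

The point of the design is that the same set of experts serves every domain, while the gate decides how much each expert contributes for a given domain.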

LLMs & generative-era¶

Real-world GenAI misuse and threat assessment:
- Marchal et al. (2024) — Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data — First empirical taxonomy of real-world GenAI misuse based on 191 documented media incidents (Jan 2023–Mar 2024); identifies 18 distinct tactics across exploitation of GenAI capabilities (impersonation, falsification, content scaling) and technical system attacks; finds most misuse is low-tech and accessible, driven by five goals (opinion manipulation 27%, monetization 21%, fraud 18%, harassment 6%, reach 3.6%); demonstrates GenAI has democratized previously costly tactics for broader pools of actors with minimal technical expertise.

Empirical detection studies:
- Su, Cardie & Nakov (2023) — Adapting Fake News Detection to the Era of Large Language Models — Comprehensive evaluation of fake news detectors across three stages: human-written dominance (Human Legacy), mixed human-machine (Transitional Coexistence), and machine-generated dominance. Key finding: detectors trained exclusively on human-written fake news generalize poorly to machine-generated fakes. Recommends training on balanced human-machine data to improve robustness. Benchmarks RoBERTa, BERT, ELECTRA, ALBERT, DeBERTa on GossipCop++ and PolitiFact++ datasets; reveals data distribution shifts caused by LLMs create asymmetric generalization challenges.
- Dugan et al. (2022) — Real or Fake Text?: Boundary Detection — Investigates human ability to detect transition points where text shifts from human-written to machine-generated (boundary detection). Introduces RoFT game platform; 21,000+ annotations across four genres show humans achieve 23.4% on first guess (vs. 10% random) and 72.3% with top-3 guesses; larger models harder to detect; genre-specific error patterns; monetary incentives improve learning.
- Can LLM-Generated Misinformation Be Detected? — Empirical evidence that LLM-generated misinformation is harder to detect for humans (9.6% vs 40.7% success) and detectors than human-written content with same semantics; builds taxonomy and LLMFake dataset.

Comprehensive surveys:
- Combating Misinformation in the Age of LLMs: Opportunities and Challenges — Systematic review of both opportunities (detection, intervention, attribution) and challenges (hallucination, intentional generation) for using LLMs in misinformation research; examines domain-specific threats and countermeasures.

Model evaluation and understanding:

Model alignment and instruction-following:
- Askell et al. (2021) — A General Language Assistant as a Laboratory for Alignment — interactive evaluation framework for alignment using helpfulness, honesty, and harmlessness (HHH) criteria; compares prompting, imitation learning, binary discrimination, and ranked preference modeling; finds ranked preference modeling scales better than imitation learning; introduces preference model pre-training (PMP) on public data to improve sample efficiency.

Disinformation generation and safety:
- Toxicity in ChatGPT: Analyzing Persona-assigned Language Models — Large-scale systematic analysis of persona-induced toxicity in ChatGPT; shows safety mechanisms can be bypassed via system parameter manipulation; 6× toxicity increase possible; reveals discriminatory bias targeting certain demographic groups, countries, and entity categories.
- Vykopal et al. (2023) — Disinformation Capabilities of Large Language Models — Comprehensive empirical study of 10 LLMs' ability to generate disinformation news articles across 20 narratives (COVID-19, Russia-Ukraine, health, elections); most models readily agree with dangerous claims; Falcon is sole exception with effective safeguards; ChatGPT shows behavioral safety; existing detectors achieve ~0.8 F1 but struggle per-sample.

Truthfulness and hallucination evaluation:
- Ji et al. (2022) — Survey of Hallucination in Natural Language Generation — comprehensive survey of hallucination across six major NLG tasks (abstractive summarization, dialogue generation, QA, data-to-text generation, machine translation, vision-language generation) and LLMs; defines intrinsic and extrinsic hallucinations; reviews metrics (statistical, model-based, human evaluation) and mitigation methods (architecture, training, post-processing, controllable generation); identifies task-specific tolerance differences and open challenges.
- Lin, Hilton & Evans (2021) — TruthfulQA: Measuring How Models Mimic Human Falsehoods — benchmark demonstrating larger language models are less truthful; proposes automated metric for evaluating factual accuracy.

Robustness of detectors and adversarial attacks:
- Sadasivan et al. (2023) — Can AI-Generated Text be Reliably Detected? — Comprehensive stress-testing of four detector classes (watermarking, neural network-based, zero-shot, retrieval-based) using recursive paraphrasing attacks; reduces watermark detector AUROC from 99.8% to 80.7%, and retrieval-based detectors below 60% accuracy with only modest text quality degradation; establishes theoretical bound on detector AUROC via total variation distance between text distributions, revealing fundamental hardness as LLMs improve.
- Mao et al. (2024) — RAIDAR: Generative AI Detection via Rewriting — Detection method leveraging rewriting behavior: LLMs preserve their own generated text while modifying human-written text when asked to rewrite. Measures three structural properties (invariance, equivariance, output uncertainty) from editing distance without requiring internal model access. Achieves 60–95 F1 across diverse domains (news, essays, code, reviews, arXiv) with 29-point improvements over prior methods; robust to adversarial rephrasing attacks even when adversaries know the detection mechanism (up to 93 F1).
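The theoretical bound mentioned in the Sadasivan et al. entry is commonly stated as AUROC ≤ 1/2 + TV − TV²/2, where TV is the total variation distance between the machine and human text distributions. A minimal sketch of that arithmetic follows; the function name `auroc_upper_bound` is an assumption for illustration, and estimating TV for real text distributions is a separate, much harder problem that this sketch does not address.

```python
def auroc_upper_bound(tv: float) -> float:
    """Best achievable detector AUROC for a given total variation distance.

    Uses the form AUROC <= 1/2 + TV - TV^2 / 2, with TV in [0, 1].
    """
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must lie in [0, 1]")
    return 0.5 + tv - tv * tv / 2.0


# As the two distributions become indistinguishable (TV -> 0),
# the bound falls toward 0.5, i.e. random guessing.
for tv in (1.0, 0.5, 0.1, 0.0):
    print(f"TV={tv:.1f} -> AUROC <= {auroc_upper_bound(tv):.3f}")
```

Read this way, the bound explains the entry's "fundamental hardness" remark: a detector can only be as good as the distributions are far apart.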

Zero-shot detection methods:
- Mitchell et al. (2023) — DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature — Identifies that LLM-generated text occupies negative-curvature regions of the log-probability landscape; zero-shot method using random perturbations to estimate Hessian trace; achieves 0.95 AUROC on GPT-2 detection without training data or access to model parameters.
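The perturbation test described in the DetectGPT entry can be sketched as a comparison between a passage's log-probability and the average log-probability of slightly perturbed copies. The sketch below is illustrative only: `toy_log_prob` is a hypothetical stand-in for a real language-model scorer, and the perturbed texts are supplied by hand rather than produced by a mask-filling model.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def perturbation_discrepancy(
    text: str,
    perturbed: Sequence[str],
    log_prob: Callable[[str], float],
    normalize: bool = False,
) -> float:
    """Log-probability of the text minus the mean over its perturbations.

    A large positive value suggests the text sits near a local maximum of
    the scorer's log-probability, which is the signal the method looks for.
    """
    scores = [log_prob(p) for p in perturbed]
    gap = log_prob(text) - mean(scores)
    if normalize:
        spread = pstdev(scores)  # optional scaling by the spread of scores
        return gap / spread if spread > 0 else 0.0
    return gap


def toy_log_prob(text: str) -> float:
    """Hypothetical scorer: longer words count as less probable."""
    return -float(sum(len(word) for word in text.split()))


original = "the cat sat"
variants = ["the feline sat", "a cat reclined"]  # hand-written perturbations
print(perturbation_discrepancy(original, variants, toy_log_prob))
```

In practice a threshold on this score would separate likely machine-generated passages from human-written ones; choosing that threshold is outside the scope of the sketch.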

Generation attacks and detection:
- Adelani et al. (2019) — Generating Sentiment-Preserving Fake Online Reviews — Demonstrates practical attack using fine-tuned GPT-2 to generate high-quality product reviews; two-step approach (generation + BERT validation) preserves sentiment; shows humans and automated detectors (Grover, GLTR, OpenAI detector) struggle to distinguish generated from authentic reviews; sentiment preservation rates 67–71% with fine-tuning.
- Solaiman et al. (2019) — Release Strategies and the Social Impacts of Language Models — OpenAI's report on GPT-2 staged release (124M–1.5B parameters, Feb–Nov 2019) and responsible AI publication norms. Evaluates human credibility perception of synthetic text (~75% for largest models), automated detection (RoBERTa ~95% accuracy), biases in outputs (gender, religion, language preference), and threat landscape. Conducted partnership-based risk analysis with external institutions (Cornell, Middlebury CTEC, University of Oregon, University of Texas Austin). Foundational for understanding staged release strategies and detecting GPT-2 generated fake news.
- Gehrmann, Strobelt & Rush (2019) — GLTR: Statistical Detection and Visualization of Generated Text — interactive tool for detecting AI-generated text by analyzing language model output distribution; three statistical tests (word probability, token rank, entropy) reveal that generated text concentrates on top-ranked (high-probability) tokens while humans use wider vocabulary; human-subjects study shows visual interface improves fake-text detection from 54% to 72% accuracy; widely deployed at gltr.io.
- Ippolito et al. (2019) — Automatic Detection of Generated Text is Easiest when Humans are Fooled — empirical study contrasting human and automatic detection of GPT-2-generated text across three decoding strategies (top-k, nucleus sampling, untruncated random). Fine-tuned BERT achieves 80%+ accuracy on long (192-token) excerpts versus 71% for trained human raters. Critical finding: detectors trained on one strategy transfer poorly to others (42.5% accuracy drop when trained on top-k and tested on nucleus), whereas humans remain robust. Reveals asymmetric difficulty: detection is easiest when humans are fooled (nucleus sampling) but hardest when humans remain reliable (top-k).
- Clark et al. (2021) — All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text — large-scale study (1,170 Amazon Mechanical Turk evaluators) examining humans' ability to assess machine-generated text from GPT-2 and GPT-3 across three domains (stories, news, recipes). Untrained evaluators achieve 57% accuracy on GPT-2 and 50% (chance) on GPT-3; three training interventions (instructions, examples, comparisons) improve accuracy modestly, with example-based training reaching 55% overall. Reveals evaluators focus on surface-level features (grammar, spelling, style) rather than content; provides evidence that consistent human evaluation methodology is critical for benchmarking NLG quality and detection systems.
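The token-rank test summarized in the GLTR entry can be sketched as follows: for each token, ask a language model for its next-token distribution and record where the observed token ranks. This is a toy illustration, not the GLTR code. `toy_next_token_probs` is a hypothetical context-free distribution standing in for a real model, and the cutoff `k` is an arbitrary choice.

```python
from typing import Callable, Dict, List, Sequence


def token_ranks(
    tokens: Sequence[str],
    next_token_probs: Callable[[Sequence[str]], Dict[str, float]],
) -> List[int]:
    """Rank (1 = most likely) of each observed token under the model."""
    ranks = []
    for i, token in enumerate(tokens):
        probs = next_token_probs(tokens[:i])  # distribution given the prefix
        ordered = sorted(probs, key=probs.get, reverse=True)
        # Tokens the model does not know are ranked after everything else.
        ranks.append(ordered.index(token) + 1 if token in probs else len(ordered) + 1)
    return ranks


def top_k_fraction(ranks: Sequence[int], k: int) -> float:
    """Share of tokens whose rank falls within the model's top k."""
    if not ranks:
        return 0.0
    return sum(rank <= k for rank in ranks) / len(ranks)


def toy_next_token_probs(prefix: Sequence[str]) -> Dict[str, float]:
    """Hypothetical model: the same distribution regardless of context."""
    return {"the": 0.5, "cat": 0.3, "sat": 0.15, "zygote": 0.05}


ranks = token_ranks(["the", "cat", "zygote"], toy_next_token_probs)
print(ranks, top_k_fraction(ranks, k=2))
```

A text whose tokens fall almost entirely in the top ranks is the pattern the entry associates with generated text, while human writing tends to include more low-ranked tokens.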

Misinformation generation and ODQA poisoning:
- On the Risk of Misinformation Pollution with Large Language Models — Investigates LLM-generated misinformation threat to ODQA systems; demonstrates GPT-3.5 can generate credible false passages that degrade retrieval-based QA performance (14–87% EM drop); proposes misinformation detection, vigilant prompting, and reader ensemble defenses.

Real-world impacts & polarization¶

Garimella et al. (2017) — Quantifying Controversy on Social Media — three-stage graph-based pipeline for measuring controversy in social media discussions via conversation topology; proposes multiple network-structure metrics with random-walk-based approach (RWC) most reliably separating controversial from non-controversial topics; validates on Twitter and external datasets.
Garimella et al. (2016) — Reducing Controversy by Connecting Opposing Views — algorithmic approach to mitigating echo chambers and polarization via graph-based edge recommendation; uses RWC metric to identify which edges to add; efficient algorithm (ROV) focuses on high-degree hub nodes; extends to ROV-AP incorporating acceptance probability based on user polarity; empirical validation on 10 Twitter controversy datasets.
Cinelli et al. (2021) — The echo chamber effect on social media — comparative analysis of 100+ million posts across Twitter, Facebook, Reddit, and Gab; quantifies echo chambers via homophily in interaction networks and bias in information diffusion; shows platform architecture (feed algorithms vs. community-based curation) determines whether homophilic clustering and polarized diffusion emerge; Facebook and Twitter exhibit strong echo chambers while Reddit shows reduced segregation despite polarization.
Bail et al. (2018) — Exposure to opposing views on social media can increase political polarization — field experiment on Twitter showing that repeated exposure to opposing political ideology can increase polarization (backfire effect), particularly for Republicans; challenges assumption that "breaking echo chambers" reduces polarization.
Soares, Recuero & Zago (2018) — Influencers in Polarized Political Networks on Twitter — social network analysis of Twitter conversations during Brazil's 2016 impeachment process; identifies three influencer types (opinion leaders, informational influencers, activists) and shows that user behavior—especially activist retweeting of in-group messages—actively reinforces echo-chamber structure and polarization beyond algorithmic curation.
Wilson & Wiysonge (2020) — Social media and vaccine hesitancy — large-scale cross-national study (137–166 countries) demonstrating causal links between social media activity and public health outcomes; social media organization for offline action predicts vaccine safety skepticism (cross-sectional); foreign disinformation campaigns associated with 2-percentage-point drop in vaccination coverage year-over-year; 15% increase in negative vaccine tweets per point on disinformation scale
Milli et al. (2023) — Engagement, User Satisfaction, and the Amplification of Divisive Content on Social Media — pre-registered field experiment comparing Twitter's engagement-based ranking algorithm with reverse-chronological and stated-preference baselines; finds engagement-based ranking amplifies partisan (0.24 SD), emotionally charged, and out-group hostile content beyond what users report preferring; proposes and evaluates stated-preference ranking that reduces harmful amplification while maintaining engagement and satisfaction.
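The random-walk controversy (RWC) score used in the two Garimella et al. entries above can be sketched with a small Monte Carlo simulation. The score is usually written RWC = P_XX · P_YY − P_YX · P_XY, where P_AB is the probability that a walk ending on side B started on side A. Everything below is a simplified illustration under stated assumptions: the graph is already partitioned into two sides, walks stop at the first high-degree node reached after at least one step, and the function names and parameters are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Optional, Sequence, Set

Graph = Dict[str, List[str]]


def top_degree_nodes(graph: Graph, side: Sequence[str], k: int) -> Set[str]:
    """The k highest-degree nodes on one side (ties broken alphabetically)."""
    return set(sorted(sorted(side), key=lambda n: len(graph[n]), reverse=True)[:k])


def walk_end_side(graph: Graph, start: str, hubs_x: Set[str], hubs_y: Set[str],
                  rng: random.Random, max_steps: int = 1000) -> Optional[str]:
    """Walk until a hub is reached; report which side that hub belongs to."""
    node = start
    for _ in range(max_steps):
        neighbors = graph[node]
        if not neighbors:
            return None  # dead end: discard this walk
        node = rng.choice(neighbors)
        if node in hubs_x:
            return "X"
        if node in hubs_y:
            return "Y"
    return None  # no hub reached within the step budget


def random_walk_controversy(graph: Graph, side_x: Sequence[str], side_y: Sequence[str],
                            k: int = 1, walks_per_side: int = 200, seed: int = 0) -> float:
    rng = random.Random(seed)
    hubs_x = top_degree_nodes(graph, side_x, k)
    hubs_y = top_degree_nodes(graph, side_y, k)
    counts: Counter = Counter()
    for label, side in (("X", sorted(side_x)), ("Y", sorted(side_y))):
        for _ in range(walks_per_side):
            end = walk_end_side(graph, rng.choice(side), hubs_x, hubs_y, rng)
            if end is not None:
                counts[(label, end)] += 1

    def p(start: str, end: str) -> float:
        """Estimated probability that a walk ending on `end` started on `start`."""
        total = counts[("X", end)] + counts[("Y", end)]
        return counts[(start, end)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("Y", "X") * p("X", "Y")


# Two disconnected triangles: walks can never cross sides.
TWO_CAMPS: Graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
print(random_walk_controversy(TWO_CAMPS, ["a", "b", "c"], ["d", "e", "f"]))
```

Scores near 1 indicate that walks rarely cross between sides, which is the signature of a controversial topic in this framing; scores near 0 indicate that walks mix freely.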

Authors¶

See all authors — sorted alphabetically.

Hunt Allcott (New York University)
Lieke Bos (University of Amsterdam)
Megan A. Brown (New York University)
Sinan Aral (MIT)
Gabriele Ballarin (Independent researcher)
Cody Buntain (New Jersey Institute of Technology)
Carlos Castillo (Yahoo! Research)
Matteo Cinelli (Ca' Foscari University of Venice)
Luca de Alfaro (UC Santa Cruz)
Gianmarco De Francisci Morales (ISI Foundation)
Marco L. Della Vedova (Università Cattolica, Brescia)
Rebecca Lewis (Data & Society)
Darren L. Linvill (Clemson University)
Alice Marwick (University of North Carolina, Chapel Hill)
Marcelo Mendoza (Yahoo! Research Latin America, Chile)
Stefano Moret (École Polytechnique Fédérale de Lausanne)
Zhaoyang Cao (Syracuse University)
Fabio Giglietto (University of Urbino Carlo Bo)
Giada Marino (IT University of Copenhagen)
Barbara Poblete (Yahoo! Research Latin America, Chile)
Nicola Righetti (University of Urbino Carlo Bo)
Luca Rossi (IT University of Copenhagen)
Hossein Derakhshan (Independent researcher)
Gregory Eady (University of Copenhagen)
Justin Farrell (Yale University)
Alessandro Galeazzi (University of Brescia)
Caroline Jack (Data & Society)
Atishay Jain (Syracuse University)
Emilio Ferrara (University of Southern California)
Matthew Gentzkow (Stanford University)
Lucas Graves (University of Alabama)
Yevgeniy Golovchenko (University of Copenhagen)
Michael Hameleers (University of Amsterdam)
Jacob L. Nelson (Arizona State University)
Thomas E. Powell (TNO, Netherlands)
Eugenio Tacchini (Università Cattolica, Piacenza)
Harsh Taneja (University of Illinois Urbana-Champaign)
Jiayu Li (Syracuse University)
Qinzhou Li (Google)
Chen Yang (Syracuse University)
Jennifer Grygiel (Syracuse University)
Huan Liu (Arizona State University)
Chilukuri K. Mohan (Syracuse University)
Apurva Mulay (Syracuse University)
John Nguyen (Syracuse University)
Vir V. Phoha (Syracuse University)
Walter Quattrociocchi (Sapienza University of Rome)
Jon Roozenbeek (University of Cambridge)
Deb Roy (MIT)
Kai Shu (Arizona State University)
Niraj Sitaula (Syracuse University)
Michele Starnini (ISI Foundation)
Toni G.L.A. Van Der Meer (University of Amsterdam)
Soroush Vosoughi (MIT)
Suhang Wang (Penn State University)
Claire Wardle (First Draft; Harvard Kennedy School)
Sander van der Linden (University of Cambridge)
Patrick L. Warren (Clemson University)
Jindi Wu (Syracuse University)
Reza Zafarani (Syracuse University)
Xinyi Zhou (Syracuse University)
Nathan Walter (Northwestern University)
Jonathan Cohen (University of Southern California)
R. Lance Holbert (University of Wisconsin-Madison)
Yasmin Morag (University of Haifa)

Topics¶

See all topics — research themes and methods.

Misinformation in diaspora and immigrant communities — misinformation targeting immigrant communities through messaging apps; platform moderation inequality; language-specific fact-checking gaps.
Audience analysis and measurement
Backfire effects and hostile media perception
Bot detection
Computational social science and large-scale text analysis
Content-based fake news detection
Coordinated inauthentic behavior
Corporate influence on scientific discourse and policy
Correction effectiveness
COVID-19 misinformation and the infodemic
Credibility assessment for fake news detection
Cross-lingual detection and transfer learning — training on one language, testing on others; transfer learning for low-resource languages.
Disinformation
Election interference and information warfare
Facebook and disinformation
Fact-checking and corrections
Generated text detection — detecting machine-generated versus human-written text using statistical and discriminative methods.
Fake news
Fake news audience and consumption
Fake news identification
Gab
Feature engineering for fake news detection
Information operations
Internet Research Agency
Information literacy
Linguistic style detection
Literacy interventions
Media literacy
Media manipulation
Medical NLP
Message design and persuasion
Meta-analysis and systematic reviews
Misinformation spread and diffusion
Multilingual fake news detection — parallel datasets and models across multiple languages.
Multimodal fake news detection
Multi-document summarization
News consumption patterns
Online subcultures
Political bias in fake news detection
Political communication
Polarization
Variational autoencoders
Propaganda
Propagation-based fake news detection
Reddit
Russian disinformation and state-sponsored information operations
Social bots
Social-context-based fake news detection
Social media and misinformation
State-sponsored information operations
Strategic communication
Terminology and conceptual frameworks
User profiles for fake news detection
Vaccine hesitancy and vaccine safety concerns

Datasets¶

See all datasets.

NELA-GT-2022 — 1.78M news articles from 361 sources (2022); source-level MBFC labels (factuality 0–5, conspiracy/pseudoscience); 346K embedded tweets; fifth NELA release with stabilized collection; SQLite and JSON formats.
NELA-GT-2018 — 713K news articles from 194 sources (2018); engagement-independent collection; source-level labels from 8 assessment sites (NewsGuard, Pew Research, Wikipedia, OpenSources, MBFC, AllSides, BuzzFeed, PolitiFact); multi-dimensional ground truth.
NELA-GT-2019 — 1.12M news articles from 260 sources (2019); source-level labels from 7 assessment sites (MBFC, AllSides, PolitiFact, etc.); 3-class aggregated reliability label; SQLite and JSON formats.
NELA-GT-2020 — 1.78M news articles from 519 sources (2020); source-level MBFC labels; novel embedded tweets feature (410K tweets); covers COVID-19 and 2020 U.S. election; SQLite and JSON formats.
MM-COVID — 3,981 fake news pieces in six languages (English, Spanish, Portuguese, Hindi, French, Italian) with 7,192 tweets; multilingual and multimodal COVID-19 dataset enabling cross-lingual detection research.
FakeNewsNet — PolitiFact + GossipCop; news content + Twitter social context; labels from professional fact-checkers.
ReCOVery — 2,029 COVID-19 news articles from 60 publishers; publisher-level NewsGuard/MBFC credibility labels; 140,820 tweets; multimodal (text, image, social).
CHECKED — 2,104 Weibo microblogs with per-item expert labels (344 fake, 1,760 real); Chinese COVID-19; includes images, video, and full propagation threads.
Weibo21 — 9,128 Weibo microblogs (4,488 fake, 4,640 real) across 9 domains (Science, Military, Education, Disasters, Politics, Health, Finance, Entertainment, Society); first multi-domain fake news dataset from a single platform; addresses domain shift for cross-domain detection.
Fake And Real News — 10,558 English news articles (binary fake/real); fake articles from 2016 Kaggle election dataset; real articles from AllSides/major outlets; 50% null accuracy.

Tools & libraries¶

See all tools (populated by ingest workflow)

Videos & talks¶

See all videos (populated by ingest workflow)

Misinformation and Data Literacy — data literacy as a defense against misinformation; chart manipulation in mainstream media; educational interventions (Calling Bullshit); emerging synthetic-media threats.
Misinformation in Diaspora Communities — messaging-app misinformation in immigrant communities; platform moderation gaps; language-specific fact-checking inequality.