Deepfakes¶

Deepfakes are synthetic videos or audio produced using machine learning techniques (typically deep generative models such as autoencoders and generative adversarial networks, or GANs) that convincingly depict a person saying or doing something they did not actually say or do. The term combines "deep learning" and "fake," reflecting the AI technology underlying their creation.

Deepfakes open a new frontier in video-based disinformation because they exploit the psychological power of visual evidence (people are more likely to believe video than text or still images) while removing the traditional requirement that creators have physical access to their subjects or sophisticated video production resources.

Technical mechanisms¶

Generative Adversarial Networks (GANs): Two neural networks compete—a generator creates synthetic video frames, while a discriminator tries to distinguish them from real ones. This adversarial loop produces increasingly realistic results.
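The adversarial loop described above is commonly written as a minimax objective, following the original GAN formulation (Goodfellow et al., 2014). The symbols are not defined elsewhere on this page and are introduced here only for illustration: $G$ is the generator, $D$ is the discriminator (outputting the probability that its input is real), $x$ is a real sample, and $z$ is random noise fed to the generator. Many practical deepfake systems use modified losses, so this should be read as the textbook form rather than the objective of any particular tool.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator raises $V$ by scoring real frames near 1 and generated frames near 0; the generator lowers $V$ by producing frames that the discriminator scores near 1.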

Face-swap techniques: The identity of one person is transferred onto the face of another in a video, while the target's lip movements and facial expressions are preserved.

Voice synthesis: Deepfake audio uses neural text-to-speech or voice-conversion models to generate realistic speech that matches a speaker's acoustic patterns.

Self-reenactment: A speaker's facial expressions and head movements from one video are transferred to another video, enabling realistic video editing without identity replacement.

Why deepfakes matter to misinformation research¶

Lower barriers to creation: Unlike traditional video forgery (which required expertise in video editing), deepfakes can be created with limited technical skill using publicly available tools and training datasets.
Visual persuasion: Video is inherently more persuasive than text or images. The saying "seeing is believing" reflects people's tendency to treat video as direct evidence of truth. This makes deepfake video particularly dangerous.
Realism heuristic: People judge credibility partly on how realistic content appears. Deepfakes that depict familiar public figures exploit this tendency and are more believable than deepfakes of unknown people.
Uncertainty amplification: Even when not fully believed, deepfakes create uncertainty ("Did this really happen?"), which erodes trust in institutions and media (see Vaccari & Chadwick, 2020).
Political weaponization potential: Deepfakes of political leaders could spread during elections, creating uncertainty about authentic statements and damaging trust in democratic discourse.

Types of deepfakes¶

Political deepfakes: Synthetic videos of politicians making compromising statements; the Obama/Peele deepfake circulated widely on social media in 2018
Non-consensual intimate deepfakes: Synthetic sexual imagery of real people, predominantly targeting women; a form of harassment and abuse
Fraudulent deepfakes: Voice cloning and video synthesis for financial fraud (e.g., deepfake audio of a CEO authorizing a wire transfer)
Manipulated media: Realistic but subtle edits (lip-sync manipulation, expression transfer) that fall short of full deepfakes but still mislead

Detectability and detection methods¶

Humans are generally poor at detecting deepfakes by eye. Research shows:

- Wang et al. found that humans correctly identify deepfakes only ~50% of the time, statistically indistinguishable from random guessing
- Compression artifacts, eye blinks, facial symmetry, and audio-visual asynchrony can indicate deepfakes, but these signals are increasingly subtle in newer-generation models
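To make "statistically indistinguishable from random guessing" concrete, the sketch below runs an exact two-sided binomial test of an observed human accuracy against the 50% chance level. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from Wang et al. or any other study.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value against a chance-level accuracy."""

    def pmf(k: int) -> float:
        return comb(trials, k) * chance**k * (1 - chance) ** (trials - k)

    observed = pmf(successes)
    # Add up every outcome that is at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts for illustration only.
near_chance = binomial_two_sided_p(successes=52, trials=100)
well_above = binomial_two_sided_p(successes=70, trials=100)
print(f"52/100 correct: p = {near_chance:.3f}")
print(f"70/100 correct: p = {well_above:.6f}")
```

A large p-value (as for 52 of 100) means the data give no evidence that viewers do better than a coin flip, while a small one (as for 70 of 100) would indicate real detection ability.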

Automated detection methods include:

- Frequency-domain analysis: Deepfakes often exhibit artifacts in Fourier space introduced by the upsampling layers of GAN generators
- Behavioral inconsistencies: Unnatural eye movement, facial expression sequences, or head pose trajectories
- Audio-visual synchronization: Lip-sync mismatches or timing inconsistencies between mouth and speech
- Forensic analysis: Camera noise patterns, sensor artifacts, or lighting inconsistencies
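As a rough illustration of the frequency-domain idea, the sketch below measures how much of a signal's spectral energy sits in the upper half of the frequency range. It is a toy under stated assumptions: it works on a single 1-D row of pixel intensities with a naive DFT, the function names and the `cutoff` parameter are hypothetical, and the "artifact" is a synthetic alternating pattern standing in for upsampling residue. Real detectors typically work on 2-D image spectra and feed such features to a trained classifier.

```python
import cmath
import math


def power_spectrum(signal):
    """Power of each non-negative frequency bin of a real 1-D signal (naive DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))) ** 2
        for k in range(n // 2 + 1)
    ]


def high_frequency_ratio(signal, cutoff=0.5):
    """Share of non-DC spectral energy at or above `cutoff` times the Nyquist frequency."""
    power = power_spectrum(signal)
    first_high_bin = max(1, int(cutoff * (len(power) - 1)))
    total = sum(power[1:])  # ignore the DC (mean brightness) term
    return sum(power[first_high_bin:]) / total if total > 0 else 0.0


# Synthetic example rows (assumptions, not real image data).
n = 32
smooth_row = [math.sin(2 * math.pi * t / n) for t in range(n)]
# Add an alternating +/- pattern, a stand-in for a periodic upsampling artifact.
artifact_row = [x + 0.3 * (-1) ** t for t, x in enumerate(smooth_row)]

print("smooth row:  ", high_frequency_ratio(smooth_row))
print("artifact row:", high_frequency_ratio(artifact_row))
```

The row with the alternating pattern carries a much larger share of high-frequency energy than the smooth one, which is the kind of difference a frequency-based detector tries to pick up.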

However, detection is an "arms race": as detection techniques improve, generation techniques advance to evade them.

Related concepts¶

Synthetic media — broader category encompassing deepfakes plus AI-generated images, audio, and text
Disinformation — intentionally false content spread to deceive; deepfakes are one disinformation vector
Misinformation spread and diffusion — how deepfakes propagate through social networks
Trust in institutions and communicators — deepfakes' primary threat mechanism is erosion of trust through uncertainty
Political communication — political deepfakes and their electoral implications

Key resources¶

Jevin West — Misinformation and Data Literacy — discusses emerging synthetic-media threats (photoshopping, voice synthesis, deepfakes) and their role in escalating misinformation landscapes; notes public literacy lags behind technical sophistication

Key papers in this wiki¶

The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation — early threat analysis identifying deepfakes and synthetic media as AI-enabled political security threats
Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data — empirical taxonomy of real-world GenAI misuse; documents use of deepfakes in impersonation and falsification campaigns across audio, video, and image modalities
Disinformation 2.0 in the Age of AI: A Cybersecurity Perspective — perspective piece on AI-enabled disinformation 2.0; discusses deepfakes as a threat vector in attack scenarios and proposes detection mechanisms across network, device, and user layers
The Creation and Detection of Deepfakes: A Survey — comprehensive survey of both creation and detection methodologies; systematically reviews generative architectures (GANs, VAEs, CNNs, RNNs), technical approaches to reenactment/replacement/editing/synthesis, and detection methods (artifact-specific and undirected); identifies arms race dynamics and current limitations
DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection — comprehensive survey of face manipulation and fake detection covering synthesis techniques (GANs, StyleGAN, DeepFake), four manipulation types, and detection methods across public databases
Rana et al. (2022) — Deepfake Detection: A Systematic Literature Review — comprehensive SLR of 112 detection studies (2018–2020) categorizing 4 technique families (77% deep learning, 18% machine learning, 3% statistical, 2% blockchain); shows deep learning approaches achieve 89.7% mean accuracy vs. 85% for ML baselines; identifies FaceForensics++ as dominant benchmark and CNN as dominant architecture
DeepFakes: a New Threat to Face Recognition? Assessment and Detection — first public GAN-based Deepfake dataset (620 videos from 16 VidTIMIT subject pairs); demonstrates that state-of-the-art VGG and FaceNet systems achieve FAR of 85.62% and 95.00% on high-quality deepfakes; shows audio-visual lip-sync detection fails but image quality metrics achieve 8.97% EER
Rössler et al. (2019) — FaceForensics++: Learning to Detect Manipulated Facial Images — introduces largest facial forgery dataset (1.8M+ images from 1K+ videos) with four manipulation methods (Face2Face, FaceSwap, DeepFakes, NeuralTextures); comprehensive detection evaluation showing XceptionNet significantly outperforms humans; systematic evaluation of compression robustness
Dolhansky et al. (2020) — The DeepFake Detection Challenge (DFDC) Dataset — introduces the largest deepfake detection benchmark with 128,154 videos from 3,426 consenting actors; includes diverse face-swap generation methods; demonstrates via Kaggle competition that deepfake detection remains unsolved despite state-of-the-art methods
Vaccari & Chadwick (2020) — Deepfakes and Disinformation: Exploring the Impact of Synthetic Political Video on Deception, Uncertainty, and Trust in News — empirical study showing that deepfakes increase uncertainty and reduce trust in news on social media, even when they don't deceive people; educational interventions showing deepfakes' synthetic nature can reduce uncertainty
Fagni et al. (2020) — TweepFake: about detecting deepfake tweets — introduces TweepFake dataset of human vs. machine-generated tweets; benchmarks 13 detection methods; finds transformer-based models achieve 90% accuracy
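Several entries above report a false acceptance rate (FAR) or an equal error rate (EER). As a reading aid, the sketch below shows how these metrics are usually computed from classifier scores. The function names and the scores are hypothetical, and the convention that a higher score means "more likely genuine" is an assumption; individual papers may define thresholds or interpolate the EER differently.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: share of impostors accepted; FRR: share of genuine samples rejected."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep observed scores as thresholds; return (EER estimate, threshold) where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        far, frr = far_frr(genuine_scores, impostor_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Hypothetical scores: genuine videos vs. deepfakes scored by some system.
genuine = [0.9, 0.8, 0.7, 0.6]
deepfake = [0.1, 0.2, 0.3, 0.65]

eer, eer_threshold = equal_error_rate(genuine, deepfake)
print(f"EER = {eer:.2%} at threshold {eer_threshold}")
```

Read this way, a FAR above 85% for a face recognition system means that most deepfakes were accepted as the claimed identity, whereas a low EER for a detector means false accepts and false rejects can both be kept small at one threshold.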

Open challenges¶

How do we scale detection methods to social media platforms operating at billions of videos per day?
What interventions most effectively reduce both belief in and uncertainty about deepfakes?
How do deepfakes interact with existing political polarization and partisan trust asymmetries?
What are the long-term effects of deepfake exposure on institutional trust and civic participation?
How do non-English-language deepfakes spread differently than English-language ones?