Generative AI Misuse¶

Generative AI (GenAI) misuse is the deliberate use of generative AI tools and models by individuals or organizations to facilitate, augment, or execute actions that may cause downstream harm. It covers both the exploitation of GenAI capabilities (e.g., creating deepfakes, generating fraudulent content, impersonating individuals) and technical attacks on GenAI systems themselves (e.g., prompt injection, model poisoning).

Scope¶

The landscape of GenAI misuse encompasses:

Capability exploitation: leveraging text, image, audio, and video generation to create synthetic content, impersonate individuals, falsify evidence, or scale harmful operations
System attacks: adversarial inputs, prompt injection, jailbreaking, model extraction, and data poisoning
Motivation-driven applications: opinion manipulation, monetization, fraud, harassment, and reach/advocacy
Actor diversity: from state-sponsored entities to private corporations to individual users, with varying technical sophistication

Early empirical evidence (2023–2024) suggests that most real-world misuse does not involve sophisticated technical attacks. Instead, it exploits easily accessible GenAI capabilities in pursuit of goals that long predate GenAI, such as impersonation, forgery, and scams. The democratization of GenAI tools has lowered the barrier to entry, so a broader pool of actors can engage in misuse with minimal technical expertise.
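The scope dimensions above (category, motivation, actor) can be read as a simple record schema for cataloguing incidents. The following is a minimal sketch under stated assumptions: the names `MisuseCategory`, `Incident`, and `category_shares` are hypothetical, and the example records are made-up placeholders, not data from any study cited on this page.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class MisuseCategory(Enum):
    """Top-level split used in the scope list (hypothetical encoding)."""
    CAPABILITY_EXPLOITATION = "capability exploitation"
    SYSTEM_ATTACK = "system attack"


@dataclass(frozen=True)
class Incident:
    """One catalogued incident; field names are illustrative assumptions."""
    summary: str
    category: MisuseCategory
    motivation: str  # e.g. "fraud", "opinion manipulation", "monetization"
    actor: str       # e.g. "individual", "state-sponsored", "private corporation"


def category_shares(incidents):
    """Return the fraction of incidents that fall into each category."""
    if not incidents:
        return {}
    counts = Counter(incident.category for incident in incidents)
    total = len(incidents)
    # Counter returns 0 for categories that never occur.
    return {category: counts[category] / total for category in MisuseCategory}


# Made-up placeholder records, for illustration only.
EXAMPLE_INCIDENTS = [
    Incident("voice-cloned call impersonating a relative",
             MisuseCategory.CAPABILITY_EXPLOITATION, "fraud", "individual"),
    Incident("synthetic image presented as evidence",
             MisuseCategory.CAPABILITY_EXPLOITATION, "opinion manipulation",
             "state-sponsored"),
    Incident("bulk AI-generated articles published for ad revenue",
             MisuseCategory.CAPABILITY_EXPLOITATION, "monetization",
             "private corporation"),
    Incident("prompt injection against a deployed assistant",
             MisuseCategory.SYSTEM_ATTACK, "fraud", "individual"),
]

shares = category_shares(EXAMPLE_INCIDENTS)
for category, share in shares.items():
    print(f"{category.value}: {share:.0%}")
```

Keeping category, motivation, and actor as separate fields lets the same incident list be grouped along any one dimension without re-labelling the records.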

Key papers¶

On the Opportunities and Risks of Foundation Models - Comprehensive analysis of the misuse risks of foundation models, including synthetic misinformation, deepfakes, fake profiles, and personalized manipulative content; examines both generative capabilities and detection defenses
Red Teaming Language Models with Language Models - Demonstrates automated discovery of harmful model behaviors, including offensive generation and data leakage; shows how LLMs can be misused to generate harmful content at scale
Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data - First comprehensive taxonomy of real-world GenAI misuse tactics, based on 191 documented incidents

Related topics¶

Deepfakes (specific modality and technique)
Synthetic media (generated content as evidence falsification)
Online Manipulation (uses GenAI to distort public opinion)
Fraud Detection (GenAI-enabled identity fraud and scams)
Misinformation and fake news detection (detection of AI-generated disinformation)