Generative AI Misuse
Generative AI misuse refers to the deliberate use of generative AI tools and models by individuals and organizations to facilitate, augment, or execute actions that may cause downstream harm. This includes both exploitation of GenAI capabilities (e.g., creating deepfakes, generating fraudulent content, impersonating individuals) and technical attacks on GenAI systems themselves (e.g., prompt injection, model poisoning).
Scope
The landscape of GenAI misuse encompasses:
- Capability exploitation: leveraging text, image, audio, and video generation to create synthetic content, impersonate individuals, falsify evidence, or scale harmful operations
- System attacks: adversarial inputs, prompt injection, jailbreaking, model extraction, and data poisoning
- Motivation-driven applications: opinion manipulation, monetization, fraud, harassment, and reach/advocacy
- Actor diversity: from state-sponsored entities to private corporations to individual users, with varying technical sophistication
Early empirical evidence (2023–2024) suggests that most real-world misuse does not involve sophisticated technical attacks. Instead, it exploits easily accessible GenAI capabilities in pursuit of goals that long predate GenAI, such as impersonation, forgery, and scams. The democratization of GenAI tools has lowered barriers to entry, enabling a broader pool of actors to engage in misuse with minimal technical expertise.
Key papers
- On the Opportunities and Risks of Foundation Models — Comprehensive analysis of foundation models' misuse risks including generation of synthetic misinformation, deepfakes, fake profiles, and personalized manipulative content; examines both generative capabilities and detection defenses
- Red Teaming Language Models with Language Models — Demonstrates automated discovery of harmful model behaviors including offensive generation and data leakage; reveals how LLMs can be misused to generate harmful content at scale
- Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data — Presents the first comprehensive taxonomy of real-world GenAI misuse tactics, based on 191 documented incidents
Related topics
- Deepfakes (specific modality and technique)
- Synthetic media (generated content as evidence falsification)
- Online Manipulation (uses GenAI to distort public opinion)
- Fraud Detection (GenAI-enabled identity fraud and scams)
- Misinformation and fake news detection (detection of AI-generated disinformation)