Disinformation Generation
Disinformation generation is the use of computational methods, particularly large language models (LLMs), to automatically create false, misleading, or manipulative content at scale. It is a significant threat to information ecosystems because LLMs can generate plausible news articles, social media posts, and other content that reinforces harmful narratives (health misinformation, election interference, propaganda) while evading human detection.
The challenge differs from the one addressed by earlier misinformation research, which focused on human-authored or bot-amplified falsehoods: here the content itself is machine-generated and can be tailored to a specific narrative through prompting.
Key papers
- Disinformation Capabilities of Large Language Models — comprehensive evaluation showing most LLMs readily generate disinformation; only Falcon exhibits consistent refusal; existing detection methods achieve ~0.8 F1
- Zellers et al. (2019) — Defending Against Neural Fake News — pioneering work that introduced GROVER, a GPT-2-style model for generating and detecting neural fake news
- Combating Misinformation in the Age of LLMs: Opportunities and Challenges — systematic review of misinformation combating in the LLM era
- Can LLM-Generated Misinformation Be Detected? — shows LLM-generated misinformation is harder for humans and detectors to identify than human-written falsehoods
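For readers unfamiliar with the F1 figure quoted for detection methods in the list above, the following is a minimal sketch of how precision, recall, and F1 are computed for a binary machine-generated-text detector. The labels and predictions are hypothetical toy data chosen for illustration, not results from any of the papers, and the helper name `f1_score` is an assumption of this sketch.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: 1 = machine-generated, 0 = human-written.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

precision, recall, f1 = f1_score(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy case the detector misses one of five machine-generated texts and wrongly flags one of five human-written texts, so precision, recall, and F1 all come out at 0.8. An F1 near 0.8 therefore still leaves a meaningful share of generated content undetected or human content misflagged.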
Related topics
- Large Language Models — the primary technology enabling disinformation generation at scale
- AI Safety — safety mechanisms and governance to prevent intentional harmful generation
- Fake news detection methods — detecting whether content is machine- or human-generated
- Fake news — broader category of intentionally false content