Hallucination Mitigation
Hallucinations are language model outputs that sound plausible but are false or fabricated. Mitigation approaches include knowledge editing to correct outdated facts, continual learning to keep the model aligned with current knowledge, retrieval augmentation to ground generation in external sources, and training-time methods to improve factual consistency.
Key papers
- Zhang et al. (2023): A survey of knowledge updating methods that mitigate hallucinations caused by outdated knowledge
- Ji et al. (2022): A comprehensive survey of hallucination phenomena in neural language generation, including a taxonomy of causes and mitigation strategies
Related topics
Notes
Knowledge-based hallucinations arise when a model reproduces outdated or incorrect facts learned from its training data. Systems that combine knowledge updating (editing, continual learning) with retrieval augmentation appear the most promising for reducing hallucinations in deployed settings.
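
To make the retrieval-augmentation idea concrete, the following is a minimal sketch of grounding a prompt in retrieved text. It is illustrative only and is not taken from the papers listed on this page: the function names (`retrieve`, `build_grounded_prompt`), the word-overlap scoring, and the prompt wording are all assumptions. A real system would typically use a dense or BM25 retriever and pass the resulting prompt to a language model.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Count tokens shared by query and passage (toy relevance score, an assumption)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(query, passages, k=2):
    """Return up to k passages with a nonzero overlap score, best first."""
    scored = [(overlap_score(query, p), i, p) for i, p in enumerate(passages)]
    scored = [item for item in scored if item[0] > 0]
    # Sort by descending score; ties keep the original passage order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [p for _, _, p in scored[:k]]


def build_grounded_prompt(question, passages, k=2):
    """Build a prompt that asks the model to answer only from retrieved sources."""
    sources = retrieve(question, passages, k)
    if sources:
        context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    else:
        context = "(no relevant sources found)"
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Hypothetical knowledge store; updating these entries changes what the
    # model is shown without retraining it.
    store = [
        "The Eiffel Tower is in Paris.",
        "Mount Everest is the highest mountain on Earth.",
        "Python was created by Guido van Rossum.",
    ]
    print(build_grounded_prompt("Who created Python?", store, k=1))
```

In this sketch the external store can be corrected or extended at any time, which is the property that makes retrieval a natural complement to knowledge updating: stale facts are fixed in the store rather than in the model weights. The explicit instruction to abstain when the sources lack the answer is one common way to discourage fabricated responses, though it does not guarantee them away.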