Hallucination Mitigation

Hallucinations are language model outputs that sound plausible but are false or fabricated. Mitigation approaches include knowledge editing to correct outdated facts, continual learning to keep the model aligned with current knowledge, retrieval augmentation to ground generation in external sources, and training-time methods to improve factual consistency.

Key papers

  • Zhang et al. (2023) — Survey of knowledge updating methods that mitigate hallucinations from outdated knowledge
  • Ji et al. (2022) — Comprehensive survey of hallucination phenomena in neural language generation, including taxonomy of causes and mitigation strategies

Notes

Knowledge-based hallucinations arise when a model reproduces facts from its training data that are outdated or incorrect. Systems that combine knowledge updating (editing, continual learning) with retrieval augmentation appear the most promising for reducing hallucinations in deployed settings.
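
To make the retrieval idea concrete, here is a minimal Python sketch of grounding a prompt in external passages. It is an illustration under stated assumptions, not a method taken from the papers above: the DOCUMENTS list, the token-overlap scoring, and the names tokenize, retrieve, and build_grounded_prompt are all hypothetical, and a real system would likely use a search index or dense retriever instead.

```python
import re

# Hypothetical in-memory knowledge store (assumption for illustration only).
# A deployed system would query a search index or vector database instead.
DOCUMENTS = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
    "Python 3.12 was released in October 2023.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents ranked by token overlap with the query.

    Token overlap is a toy relevance score chosen for simplicity.
    Documents with no overlap are dropped.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(question, documents, k=2):
    """Build a prompt that asks the model to answer only from retrieved text."""
    passages = retrieve(question, documents, k)
    if passages:
        context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    else:
        context = "(no relevant passages found)"
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("When was the Eiffel Tower completed?", DOCUMENTS))
```

The resulting prompt would be passed to whatever language model is in use. Updating the entries in DOCUMENTS changes what the model is shown without retraining it, which is one reason retrieval is often paired with knowledge updating.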