Machine Translation¶
Machine translation (MT) systems automatically convert text from a source language to a target language. Neural machine translation (NMT), built on sequence-to-sequence architectures such as attention-based recurrent models and transformers, has largely displaced statistical machine translation. Its output is fluent and can approach human translation quality on well-resourced language pairs.
Hallucination in neural machine translation¶
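NMT systems exhibit hallucinations, though the phenomenon manifests differently from other NLG tasks: the source and the output are in different languages, so faithfulness cannot be checked by direct lexical overlap with the input. Common hallucination types: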
- Untranslated sequences: Model generates repetitions or copies of source-language words in the target output
- Detached hallucinations: Model generates fluent target sequences that don't correspond to any part of the source
- Alignment failures: Incorrect word-order or phrase-structure alignments that produce ungrammatical or nonsensical translations
Hallucinations often stem from noise in the parallel training data, training/inference mismatch (exposure bias), and the lack of an explicit coverage constraint in beam search decoding, which lets a hypothesis ignore parts of the source.
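Untranslated or repeated sequences are the easiest of these types to flag automatically. The sketch below is a minimal heuristic, not an established detector: it assumes whitespace tokenization, and the function names and thresholds (`max_repeats`, `max_copy_rate`) are hypothetical values chosen for illustration.

```python
from collections import Counter


def max_ngram_repeats(tokens, n=2):
    """Return the highest count of any single n-gram in the token list."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0
    return max(Counter(ngrams).values())


def copy_rate(source_tokens, target_tokens):
    """Fraction of target tokens that also appear verbatim in the source."""
    if not target_tokens:
        return 0.0
    source_vocab = set(source_tokens)
    copied = sum(1 for tok in target_tokens if tok in source_vocab)
    return copied / len(target_tokens)


def flag_output(source, target, max_repeats=3, max_copy_rate=0.5):
    """Flag a translation using two illustrative (assumed) thresholds."""
    src = source.lower().split()
    tgt = target.lower().split()
    return {
        # Some bigram occurs at least `max_repeats` times in the output.
        "repetitive": max_ngram_repeats(tgt) >= max_repeats,
        # More than `max_copy_rate` of the output is copied from the source.
        "copied": copy_rate(src, tgt) > max_copy_rate,
    }


source = "das ist ein kleines haus"
print(flag_output(source, "this is a small house"))
print(flag_output(source, "the the the the house"))
print(flag_output(source, "das ist ein small house"))
```

Names, numbers, and loanwords are legitimately copied from the source, so a high copy rate is only a signal for closer inspection, and the thresholds would need tuning per language pair.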
Evaluation¶
- Automatic metrics: BLEU, METEOR, and TER score overall translation quality against reference translations but don't specifically measure hallucination
- Human evaluation: Fluency and adequacy judgments; hallucinations reduce adequacy
- Alignment-based metrics: Check whether translated sequences align to source phrases
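An alignment-based check can be sketched as the share of target tokens that have no plausible counterpart in the source. The example below assumes a bilingual lexicon is available; `TOY_LEXICON` and `unaligned_target_rate` are hypothetical names, and the entries are made up for illustration. A real system would more likely use a learned word aligner than a hand-written dictionary.

```python
def unaligned_target_rate(source_tokens, target_tokens, lexicon):
    """Fraction of target tokens with no lexicon link to any source token.

    `lexicon` maps a source word to a set of acceptable target words.
    """
    if not target_tokens:
        return 0.0
    # Collect every target word that some source token could explain.
    supported = set()
    for src in source_tokens:
        supported.update(lexicon.get(src, ()))
    unaligned = [tok for tok in target_tokens if tok not in supported]
    return len(unaligned) / len(target_tokens)


# Hypothetical toy German-English lexicon, for illustration only.
TOY_LEXICON = {
    "das": {"the", "that", "this"},
    "haus": {"house", "home"},
    "ist": {"is"},
    "klein": {"small", "little"},
}

source = ["das", "haus", "ist", "klein"]
faithful = ["the", "house", "is", "small"]
detached = ["the", "weather", "was", "nice", "yesterday"]

print(unaligned_target_rate(source, faithful, TOY_LEXICON))  # 0.0
print(unaligned_target_rate(source, detached, TOY_LEXICON))  # 0.8
```

A high unaligned rate suggests target content that is not grounded in the source, which is the adequacy failure that reference-overlap metrics do not isolate.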
Mitigation¶
- Data cleaning: Remove low-quality, misaligned, or noisy parallel data
- Architecture improvements: Explicit coverage mechanisms, fertility models
- Training methods: Diversified training, back-translation, knowledge distillation
- Inference: Constrained decoding with coverage penalties
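To make the coverage penalty concrete, the sketch below uses one widely cited formulation, the one described for the GNMT system: beta * sum over source positions i of log(min(sum over target steps j of p[j][i], 1.0)). The function names, the default `beta`, and the small clamping constant are assumptions made for this example, and the attention matrices are invented.

```python
import math


def coverage_penalty(attention, beta=0.2):
    """Coverage penalty for one hypothesis (GNMT-style formulation).

    attention[j][i] is the attention weight from target step j to source
    position i. The result is 0.0 when every source position has received
    a total weight of at least 1.0, and negative otherwise.
    """
    if not attention:
        raise ValueError("attention must contain at least one target step")
    num_source = len(attention[0])
    penalty = 0.0
    for i in range(num_source):
        total = sum(step[i] for step in attention)
        # Clamp away from zero (an assumption of this sketch) so a fully
        # ignored source token yields a large finite penalty, not -inf.
        penalty += math.log(min(max(total, 1e-9), 1.0))
    return beta * penalty


def rescore(log_prob, attention, beta=0.2):
    """Add the coverage penalty to a hypothesis log-probability."""
    return log_prob + coverage_penalty(attention, beta)


# Two hypotheses over a two-token source, with invented attention weights.
covers_all = [[1.0, 0.0], [0.0, 1.0]]
skips_second = [[1.0, 0.0], [1.0, 0.0]]

print(rescore(-1.0, covers_all))    # unchanged: every source token is covered
print(rescore(-1.0, skips_second))  # lower: the second source token is ignored
```

During beam search this term would be added when ranking hypotheses, so candidates that ignore part of the source lose out to candidates that attend to all of it.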
Key papers¶
- Survey of Hallucination in Natural Language Generation — Section 11 surveys hallucination definitions, metrics, and mitigation approaches in neural machine translation
Related topics¶
- Natural Language Generation — broader task
- Hallucination in language models — cross-task phenomenon
- Neural language models — transformer-based translation models
- Sequence To Sequence Models — foundational architecture