Neural Machine Translation

Neural machine translation (NMT) systems use neural networks, primarily encoder-decoder architectures with attention (including transformers), to translate directly between languages without intermediate hand-crafted linguistic representations. Unlike statistical phrase-based approaches, NMT models learn an end-to-end mapping from source to target sequences, which tends to yield more fluent and contextually coherent translations.
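As a rough illustration of the attention step in an encoder-decoder model, the sketch below computes scaled dot-product attention for a single decoder state over a few encoder states. It is a minimal pure-Python toy, not a production implementation: the names `attend`, `encoder_states`, and `decoder_state` are hypothetical, the vectors are made up, and real systems use learned projections, multiple heads, and batched tensor operations.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights over the source positions and the
    resulting context vector (a weighted sum of the value vectors).
    """
    dim = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    # Context vector: weighted sum of the value vectors, per dimension.
    context = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, context


# Toy encoder outputs for three source tokens (illustrative values only).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy decoder state for the target token currently being generated.
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states, encoder_states)
print(weights)
print(context)
```

In this toy setup the first and third source tokens receive equal weight, and more than the second, because their vectors align with the decoder state. The decoder would use the context vector when predicting the next target token.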

Modern NMT systems, particularly transformer-based models, have achieved state-of-the-art performance on many language pairs and are widely deployed in industry. Massively multilingual variants (e.g., M2M-100, which covers 100 languages) extend NMT to low-resource pairs by sharing representations across languages. However, these systems exhibit failure modes such as hallucinations, undertranslation, and bias, especially in low-resource settings.

Key papers

  • [[2023-guerreiro-hallucinations-multilingual]] — comprehensive study of hallucinations across multilingual translation models