Abstractive Summarization

Abstractive summarization systems read source documents and generate shorter, human-readable summaries that capture the essential information. Unlike extractive summarization (which selects existing sentences), abstractive systems must paraphrase, compress, and reorganize content—tasks that require deep language understanding and generation.

Architectures

Modern abstractive summarization uses transformer-based sequence-to-sequence models (BART, T5, PEGASUS) trained on large paired datasets of documents and summaries. These models encode the source document and decode a summary token by token. During training, their parameters are optimized to maximize the likelihood of the reference summary given the source.
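The likelihood objective can be illustrated with a minimal sketch, assuming the model's per-step output distributions are already available as plain dictionaries. The function name `sequence_nll`, the toy three-token vocabulary, and the probabilities are hypothetical and only serve the illustration; a real system would compute this loss over batched logits inside a deep learning framework.

```python
import math

def sequence_nll(step_probs, reference_ids):
    """Negative log-likelihood of a reference summary.

    step_probs[t] is the (assumed) model distribution at decoding step t,
    given as a dict mapping token id -> probability. Each distribution is
    conditioned on the source and on the reference tokens before step t.
    """
    if len(step_probs) != len(reference_ids):
        raise ValueError("one distribution per reference token is required")
    # Sum the log-probability assigned to each reference token, then negate.
    return -sum(math.log(dist[tok]) for dist, tok in zip(step_probs, reference_ids))

# Hypothetical distributions for a two-token reference summary [0, 1].
step_probs = [
    {0: 0.7, 1: 0.2, 2: 0.1},
    {0: 0.1, 1: 0.8, 2: 0.1},
]
loss = sequence_nll(step_probs, [0, 1])
```

Minimizing this quantity is the same as maximizing the likelihood of the reference. Here the loss is about 0.58, and it approaches zero as the model puts more probability on each reference token.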

Hallucination problem

Abstractive summarization is particularly vulnerable to hallucination—generating plausible but factually incorrect or unsupported claims not present in the source document. This is critical for high-stakes applications (medical, legal, news summarization) where false information can cause harm.

Intrinsic hallucinations: Summary contradicts the source document (e.g., claiming a vaccine was approved in 2021 when the source says 2019).

Extrinsic hallucinations: Summary includes information that cannot be verified from the source, whether or not it is factually correct (e.g., adding background knowledge about a person not discussed in the article).

Evaluation and detection

Key metrics for measuring hallucination in abstractive summarization:

  • Information Extraction (IE)-based: Extract entities and relations from source and summary; compare overlap
  • Natural Language Inference (NLI): Check whether summary is entailed by the source
  • QA-based: Generate questions from the summary and check if the source answers them consistently
  • Human evaluation: Crowdsourced judgment of faithfulness
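The IE-based comparison can be sketched in a few lines, assuming a deliberately crude extractor in place of a real entity and relation extraction system. The regular expression, the `IGNORE` set, and the function names below are illustrative assumptions, not part of any published metric.

```python
import re

# Sentence-initial function words that the crude pattern would otherwise pick up.
IGNORE = {"The", "A", "An"}

def extract_facts(text):
    """Toy stand-in for an IE system: capitalized tokens and numbers."""
    found = re.findall(r"\b(?:[A-Z][A-Za-z]+|\d+(?:\.\d+)?)\b", text)
    return {tok for tok in found if tok not in IGNORE}

def unsupported_facts(source, summary):
    """Facts mentioned in the summary that never appear in the source."""
    return extract_facts(summary) - extract_facts(source)

def fact_precision(source, summary):
    """Share of summary facts that are also found in the source."""
    facts = extract_facts(summary)
    if not facts:
        return 1.0  # nothing to verify
    return len(facts & extract_facts(source)) / len(facts)

source = "The FDA approved the vaccine in 2019."
summary = "The FDA approved the vaccine in 2021."
missing = unsupported_facts(source, summary)   # {"2021"}
precision = fact_precision(source, summary)    # 0.5
```

The mismatched year is flagged because it appears in the summary but not in the source. Surface matching of this kind misses paraphrases and cannot tell whether a matched entity is used in the same relation, which is one reason NLI-based and QA-based checks are used as well.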

Mitigation strategies

  • Architecture methods: Modify encoders/decoders to enforce grounding (e.g., adding explicit retrieval of supporting phrases from source)
  • Training methods: Contrastive learning, reinforcement learning with factuality-aware rewards, joint training on entailment
  • Post-processing: Filter or correct hallucinated spans using learned correction models or fact-checking systems
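As a rough illustration of the post-processing strategy, the sketch below drops summary sentences that a support scorer rates below a threshold. The scorer used here (`token_support`, plain token overlap) is a hypothetical stand-in chosen to keep the example self-contained; in practice that slot would hold a learned correction model, an NLI model, or a fact-checking system. The threshold of 0.8 is likewise an arbitrary assumption.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def token_support(source, sentence):
    """Toy scorer: fraction of sentence tokens that also occur in the source."""
    source_tokens = set(re.findall(r"\w+", source.lower()))
    tokens = re.findall(r"\w+", sentence.lower())
    if not tokens:
        return 1.0
    return sum(tok in source_tokens for tok in tokens) / len(tokens)

def filter_summary(source, summary, scorer=token_support, threshold=0.8):
    """Keep only the summary sentences whose support score meets the threshold."""
    kept = [s for s in split_sentences(summary) if scorer(source, s) >= threshold]
    return " ".join(kept)

source = ("The city council approved the new budget on Monday. "
          "The budget increases school funding.")
summary = "The council approved the budget. The mayor resigned after a scandal."
cleaned = filter_summary(source, summary)
# cleaned == "The council approved the budget."
```

Because the scorer is passed in as a parameter, the same filtering loop can be reused with a stronger faithfulness model. Filtering removes unsupported sentences but does not repair them; correcting a hallucinated span would need a separate rewriting step.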

Key papers