Multi-document summarization

Multi-document summarization (MDS) aims to synthesize information from multiple source documents into a coherent summary that captures the essential content while avoiding redundancy. It is more challenging than single-document summarization because systems must reconcile conflicting information, track multiple perspectives, and decide which details from each source to include.

Key challenges

Information fusion: Determining which information from multiple sources to include and how to integrate conflicting claims or evidence.

Redundancy reduction: Avoiding verbatim or near-duplicate repetition in the summary when the same content appears in several source documents.

Coherence and structure: Producing fluent, well-organized summaries that don't read as disconnected snippets from different documents.

Evidence synthesis: For specialized domains like medicine or law, correctly synthesizing evidence from multiple studies or sources while maintaining factual accuracy.
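The redundancy-reduction challenge above can be illustrated with a small sketch. It is not taken from any cited paper: it assumes sentences have already been extracted from each source, and it uses a greedy filter with Jaccard word overlap as a stand-in for whatever similarity measure a real system would use. The function names and the 0.6 threshold are hypothetical choices made for illustration.

```python
import re


def tokens(sentence):
    """Lowercased set of alphanumeric word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(sentences, threshold=0.6):
    """Greedily keep a sentence only if it is not too similar to any kept one.

    The threshold is an illustrative assumption, not a tuned value.
    """
    kept = []
    kept_tokens = []
    for sentence in sentences:
        t = tokens(sentence)
        if all(jaccard(t, k) < threshold for k in kept_tokens):
            kept.append(sentence)
            kept_tokens.append(t)
    return kept


# Two toy "documents" covering the same event.
docs = [
    ["The storm hit the coast on Monday.", "Thousands lost power."],
    ["The storm hit the coast on Monday night.", "Schools were closed on Tuesday."],
]
pooled = [s for doc in docs for s in doc]
summary_candidates = drop_near_duplicates(pooled)
```

In this toy input the second document's first sentence overlaps heavily with the first document's opening sentence, so only one of the two survives. Word overlap misses paraphrases, so a practical system would likely swap in a semantic similarity measure.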

Application domains

News summarization: Synthesizing coverage of events from multiple news outlets.

Medical literature review: Automatically summarizing evidence from multiple studies on a clinical question, the kind of synthesis that Cochrane reviews perform manually.

Scientific article summarization: Aggregating findings across multiple papers on a research topic.

Legal document analysis: Summarizing claims, evidence, and decisions across multiple case documents.

Key papers

  • [[2023-wang-medical-summarization-metrics]] — evaluates automated metrics for medical MDS and finds significant disagreement with human judgments