Multi-document summarization
Multi-document summarization (MDS) aims to synthesize information from multiple source documents into a coherent summary that captures the essential content while avoiding redundancy. It is more challenging than single-document summarization because systems must reconcile conflicting information, track multiple perspectives, and decide which details to include.
Key challenges
Information fusion: Determining which information from multiple sources to include and how to integrate conflicting claims or evidence.
Redundancy reduction: Avoiding verbatim or near-duplicate repetition of content that several source documents share.
Coherence and structure: Producing fluent, well-organized summaries that don't read as disconnected snippets from different documents.
Evidence synthesis: For specialized domains like medicine or law, correctly synthesizing evidence from multiple studies or sources while maintaining factual accuracy.
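The redundancy-reduction challenge above can be illustrated with a minimal sketch. It assumes a simple greedy filter based on Jaccard word overlap, which is only one of many possible approaches and is not taken from any cited paper. The function names (`tokens`, `jaccard`, `drop_redundant`), the `0.6` threshold, and the example sentences are all hypothetical.

```python
from __future__ import annotations

import re


def tokens(sentence: str) -> set[str]:
    """Return the set of lowercased alphanumeric words in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two word sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_redundant(sentences: list[str], threshold: float = 0.6) -> list[str]:
    """Greedily keep a sentence only if it is not too similar to any kept one.

    The threshold is an illustrative assumption, not a recommended value.
    """
    kept: list[str] = []
    kept_tokens: list[set[str]] = []
    for sentence in sentences:
        current = tokens(sentence)
        if all(jaccard(current, previous) < threshold for previous in kept_tokens):
            kept.append(sentence)
            kept_tokens.append(current)
    return kept


# Hypothetical sentences from two source documents covering the same event.
docs = [
    ["The river flooded the town on Monday.", "Hundreds of residents were evacuated."],
    ["On Monday the river flooded the town.", "Officials blamed heavy rainfall."],
]
pooled = [sentence for doc in docs for sentence in doc]
selected = drop_redundant(pooled)
```

Here the reordered sentence from the second document shares exactly the same word set as the first sentence, so it is filtered out while the other three sentences are kept. Word overlap will miss paraphrases that use different vocabulary, so a real system would likely need a stronger similarity measure.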
Application domains
News summarization: Synthesizing coverage of events from multiple news outlets.
Medical literature review: Automatically summarizing evidence from multiple studies on a clinical question (e.g., the kind of synthesis found in Cochrane systematic reviews).
Scientific article summarization: Aggregating findings across multiple papers on a research topic.
Legal document analysis: Summarizing claims, evidence, and decisions across multiple case documents.
Key papers
- [[2023-wang-medical-summarization-metrics]] — evaluates automated metrics for medical MDS and finds significant disagreement with human judgments