Evidence Extraction

Evidence extraction is the task of automatically identifying and retrieving the most relevant text snippets or sentences from a document collection that either support or refute a given claim. This is a core subtask in the fact-checking pipeline, bridging document retrieval and claim verification.

Problem formulation

Given:

- A claim (string)
- A collection of documents or sentences

Find:

- The subset of sentences/passages that provide evidence for or against the claim
- A ranking or relevance score for each piece of evidence

Evidence extraction can be formulated as:

1. Classification: labeling sentences as supporting, refuting, or irrelevant
2. Ranking: ordering sentences by their relevance for validating the claim
3. Span extraction: identifying the minimal text spans containing essential evidence
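To make the ranking formulation concrete, here is a minimal sketch that scores candidate sentences by TF-IDF cosine similarity to the claim and returns the top `k`. It is only a lexical baseline, not the method of any paper cited on this page. The names `tokenize` and `rank_sentences`, the smoothed IDF formula, and the example sentences are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sentences(claim, sentences, top_k=5):
    """Rank candidate sentences by TF-IDF cosine similarity to the claim.

    Returns a list of (sentence, score) pairs, highest score first.
    This is a purely lexical baseline (an assumption for illustration).
    """
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    # Document frequency: number of candidate sentences containing each term.
    df = Counter(term for doc in docs for term in set(doc))

    def idf(term):
        # Smoothed IDF, so terms unseen in the candidates stay finite.
        return math.log((1 + n_docs) / (1 + df[term])) + 1.0

    def vectorize(tokens):
        tf = Counter(tokens)
        return {t: c * idf(t) for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    claim_vec = vectorize(tokenize(claim))
    scored = [(s, cosine(claim_vec, vectorize(d))) for s, d in zip(sentences, docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Invented example data, not taken from any dataset.
claim = "The Eiffel Tower is located in Paris."
candidates = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Bananas contain potassium.",
    "Paris is the capital of France.",
]
ranked = rank_sentences(claim, candidates, top_k=3)
```

The classification formulation could reuse the same candidates but assign each one a label (supporting, refuting, irrelevant) instead of a score. A lexical scorer like this one cannot tell support from refutation, so in practice a trained model would be needed for that step.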

Challenges

- Relevance vs. similarity: Lexically similar sentences may not be relevant evidence (e.g., they mention the same topics without addressing the claim)
- Multi-hop reasoning: Evidence sometimes requires combining information across multiple sentences
- Source reliability: In heterogeneous document collections, unreliable sources may supply false "evidence"
- Granularity: Determining the right unit of evidence (word span, sentence, paragraph, document)
- Fine-grained evidence: Annotating which parts of a sentence are actual evidence and which are background
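The relevance vs. similarity problem can be illustrated with a small sketch, using Jaccard token overlap as a stand-in for any purely lexical similarity measure. The claim and the two sentences are invented for illustration, and `jaccard_overlap` is a hypothetical helper rather than part of any cited system.

```python
import re


def jaccard_overlap(text_a, text_b):
    """Jaccard similarity between the token sets of two texts."""
    tokens_a = set(re.findall(r"[a-z0-9]+", text_a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", text_b.lower()))
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Invented example: the distractor shares the topic but says nothing about
# the claim, while the actual evidence is a paraphrase with few shared words.
claim = "Aspirin reduces the risk of heart attack."
distractor = "Aspirin and the risk of heart attack were discussed at the conference."
evidence = "Daily acetylsalicylic acid lowers the chance of myocardial infarction."

distractor_score = jaccard_overlap(claim, distractor)
evidence_score = jaccard_overlap(claim, evidence)

# The topical distractor outranks the paraphrased evidence.
print(distractor_score > evidence_score)
```

Under this measure the distractor scores higher than the real evidence, which is why systems that rely on surface overlap alone tend to retrieve on-topic but uninformative sentences.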

Key papers

- Hanselowski et al. (2019), A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking: fine-grained evidence extraction with detailed sentence-level annotations; compares ranking (FEVER-style) and classification approaches; the best models (BiLSTM, rankingESIM) achieve recall@5 of 0.637 and 0.507, respectively; identifies paraphrased evidence and topic overlap without relevance as key challenges
- Thorne et al. (2018), FEVER: A Large-Scale Dataset for Fact Extraction and VERification: pipeline combining document retrieval, sentence selection, and textual entailment; evidence is Wikipedia sentences with document-level evidence supervision
- Thorne et al. (2018), The Fact Extraction and VERification (FEVER) Shared Task: shared task with emphasis on sentence selection as evidence retrieval
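The recall@5 figures quoted above measure how much of the annotated evidence appears among the top five retrieved sentences. The sketch below uses one common definition of recall@k (the fraction of gold evidence items found in the top `k`). The exact evaluation protocol in the cited papers may differ, and the function name and sentence IDs are assumptions for illustration.

```python
def recall_at_k(ranked_ids, gold_ids, k=5):
    """Fraction of gold evidence items that appear in the top-k ranked items.

    ranked_ids: sentence IDs ordered from most to least relevant.
    gold_ids:   IDs annotated as evidence for the claim.
    """
    gold = set(gold_ids)
    if not gold:
        # No annotated evidence: recall is undefined, return 0.0 by convention.
        return 0.0
    hits = gold.intersection(ranked_ids[:k])
    return len(hits) / len(gold)


# Hypothetical ranking: "s1" is in the top 5, "s4" is ranked sixth.
ranking = ["s3", "s1", "s7", "s2", "s9", "s4"]
gold_evidence = {"s1", "s4"}
score = recall_at_k(ranking, gold_evidence, k=5)
```

A dataset-level score would typically average this value over all claims that have at least one annotated evidence sentence.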
