Claim Verification
Claim verification (or automated fact-checking) is the task of determining whether a factual claim is true, false, or indeterminate given available evidence. Unlike manual fact-checking by journalists, automated claim verification aims to scale verification through natural language processing and machine learning. The typical pipeline has three stages: (1) evidence retrieval, which finds relevant documents or passages that could support or refute the claim; (2) evidence selection, which identifies the specific sentences or passages within them that bear on the claim; and (3) veracity classification, which determines whether the claim is supported, refuted, or cannot be determined from the evidence.
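The three stages can be sketched end to end as below. This is a minimal illustration, not any paper's method: the toy `CORPUS`, the token-overlap scoring, and the rule-based `classify` step are all assumptions standing in for a real retriever and a trained natural language inference model.

```python
import re
from collections import Counter

# Toy corpus standing in for a real document store (illustrative data only).
CORPUS = {
    "doc_paris": "Paris is the capital of France. It lies on the Seine.",
    "doc_berlin": "Berlin is the capital of Germany. It lies on the Spree.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(a, b):
    """Count tokens shared by two strings (a crude relevance score)."""
    return sum((Counter(tokenize(a)) & Counter(tokenize(b))).values())


def retrieve_documents(claim, corpus, k=2):
    """Stage 1: rank documents by token overlap with the claim."""
    ranked = sorted(corpus, key=lambda d: overlap(claim, corpus[d]), reverse=True)
    return ranked[:k]


def select_sentences(claim, doc_ids, corpus, k=1):
    """Stage 2: keep the k sentences that overlap most with the claim."""
    sentences = [
        s.strip() for d in doc_ids for s in corpus[d].split(".") if s.strip()
    ]
    return sorted(sentences, key=lambda s: overlap(claim, s), reverse=True)[:k]


def classify(claim, evidence, min_overlap=2):
    """Stage 3: hypothetical rule standing in for an NLI model.

    Too little overlap means no usable evidence; otherwise the claim is
    'supported' only if every claim token appears in the best sentence.
    """
    if not evidence:
        return "NOT ENOUGH INFO"
    best = max(evidence, key=lambda s: overlap(claim, s))
    if overlap(claim, best) < min_overlap:
        return "NOT ENOUGH INFO"
    if set(tokenize(claim)) <= set(tokenize(best)):
        return "SUPPORTED"
    return "REFUTED"


def verify(claim, corpus=CORPUS):
    """Run retrieval, selection, and classification in sequence."""
    doc_ids = retrieve_documents(claim, corpus)
    evidence = select_sentences(claim, doc_ids, corpus)
    return classify(claim, evidence), evidence


if __name__ == "__main__":
    for claim in ["Paris is the capital of France",
                  "Berlin is the capital of France",
                  "Tokyo hosts sumo"]:
        print(claim, "->", verify(claim))
```

Keeping each stage behind its own function means the overlap heuristics can be swapped for a search index or a learned model without touching the rest of the pipeline. The rule in `classify` is deliberately naive and would mislabel many real claims.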
Key papers
- Quelle & Bovet (2023) — The Perils & Promises of Fact-checking with Large Language Models — evaluates GPT-3.5 and GPT-4 for claim verification on PolitiFact and multilingual Data Commons; demonstrates a ReAct agent framework combining LLM reasoning with iterative Google Search retrieval; shows context retrieval improves accuracy by 10–20 percentage points and reveals a training-data bias that limits non-English performance
- Wang & Shu (2023) — Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models — proposes FOLK, which uses first-order logic to decompose claims and LLMs to retrieve grounded answers; achieves state-of-the-art results on HoVER (54.80% F1 on 3-hop), FEVEROUS (67.01% F1 on multi-hop), and SciFactOpen (67.59% F1); generates human-readable explanations with high coverage and readability
- Augenstein et al. (2019) — MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims — dataset of 34,918 naturally occurring claims from 26 fact-checking websites with entity linking and evidence retrieval; multi-task learning approach jointly predicts veracity and ranks evidence pages; best model achieves Macro F1 of 49.2%, demonstrating the challenge of multi-domain heterogeneity
- Thorne et al. (2018) — FEVER: A Large-Scale Dataset for Fact Extraction and VERification — introduces the FEVER dataset and a pipeline approach; 185,445 claims verified against Wikipedia with evidence sentences; baseline system combines document retrieval, sentence selection, and natural language inference, achieving 31.87% accuracy when correct evidence is required
- Jin et al. (2021) — Towards Fine-Grained Reasoning for Fake News Detection — performs claim-level verification by constructing claim-evidence graphs from social media and using graph neural networks to reason over evidence at fine granularity; ranks evidence by importance using mutual-reinforcement mechanisms that integrate human knowledge; achieves 91.7% F1 on PolitiFact by modeling which evidence groups support or refute claims
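The FEVER entry above reports accuracy under a correct-evidence requirement, which is stricter than plain label accuracy. The sketch below contrasts the two on made-up data. It assumes a prediction counts as correct only if the label matches and, for supported or refuted claims, the predicted evidence covers at least one complete gold evidence set. The dictionary layout, field names, and example records are hypothetical, and the official scorer includes details omitted here, such as a cap on the number of predicted sentences.

```python
def label_accuracy(predictions, gold):
    """Fraction of claims whose predicted label matches the gold label."""
    correct = sum(p["label"] == g["label"] for p, g in zip(predictions, gold))
    return correct / len(gold)


def strict_accuracy(predictions, gold):
    """Label accuracy with the added requirement of correct evidence.

    Evidence items are (page, sentence_index) pairs. A gold claim may list
    several alternative evidence sets; covering any one of them is enough.
    """
    correct = 0
    for p, g in zip(predictions, gold):
        if p["label"] != g["label"]:
            continue
        if g["label"] == "NOT ENOUGH INFO":
            # No evidence exists for these claims, so the label alone decides.
            correct += 1
            continue
        predicted = set(p["evidence"])
        if any(set(group) <= predicted for group in g["evidence_sets"]):
            correct += 1
    return correct / len(gold)


# Hypothetical gold annotations and system outputs (illustrative only).
gold = [
    {"label": "SUPPORTS", "evidence_sets": [[("Paris", 0)]]},
    {"label": "REFUTES", "evidence_sets": [[("Berlin", 0), ("Germany", 2)]]},
    {"label": "NOT ENOUGH INFO", "evidence_sets": []},
    {"label": "SUPPORTS", "evidence_sets": [[("Rome", 1)]]},
]
predictions = [
    {"label": "SUPPORTS", "evidence": [("Paris", 0), ("Paris", 3)]},  # label and evidence correct
    {"label": "REFUTES", "evidence": [("Berlin", 0)]},                # evidence set incomplete
    {"label": "NOT ENOUGH INFO", "evidence": []},                     # label alone suffices
    {"label": "REFUTES", "evidence": [("Rome", 1)]},                  # wrong label
]

if __name__ == "__main__":
    print("label accuracy: ", label_accuracy(predictions, gold))
    print("strict accuracy:", strict_accuracy(predictions, gold))
```

On this toy data the second prediction has the right label but only half of its two-sentence evidence set, so it counts toward label accuracy and not toward strict accuracy. This gap is why evidence-aware scores tend to be much lower than label-only ones.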
Related topics
- Fact-checking and corrections — manual and automated approaches to verification
- Natural Language Inference — reasoning over evidence to classify claim veracity
- Information Retrieval — retrieving evidence documents for verification