Computational fact checking
Computational fact checking encompasses automated approaches to assessing the truthfulness of factual claims. Unlike manual fact-checking by journalists and expert annotators, computational methods aim to scale verification to the volume of claims produced online. Approaches range from knowledge-graph-based methods that reason over structured knowledge bases, through NLP-based evidence retrieval systems that locate and rank supporting documents, to network-based credibility assessment.
Approaches
Knowledge-graph based fact checking: Reasoning over structured knowledge graphs (e.g., DBpedia, Freebase, or graphs extracted from Wikipedia) to assess claims by analyzing connectivity patterns and entity relationships. The underlying assumption is that a factually correct statement corresponds to a path in the knowledge graph between the entities it mentions.
Evidence-based fact verification: Identifying supporting or refuting evidence from text corpora, web documents, or existing knowledge bases. Often paired with NLP for claim decomposition and retrieval.
Source credibility assessment: Evaluating the historical accuracy and reliability of information sources as a proxy for individual claim verification.
Multimodal verification: For claims involving images, video, or audio: reverse image search, manipulation detection, caption-image consistency checking, and metadata analysis.
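The knowledge-graph approach above can be illustrated with a small path-based scorer. This is a minimal sketch, not a reference implementation: the function names `build_graph` and `path_support`, the toy triples, and the exact scoring formula (one over one plus the summed log-degree of intermediate nodes) are assumptions chosen for illustration, loosely following the idea that paths through generic, highly connected entities are weaker support than paths through specific ones.

```python
import heapq
import math
from collections import defaultdict


def build_graph(triples):
    """Build an undirected adjacency map from (subject, predicate, object) triples."""
    graph = defaultdict(set)
    for subj, _pred, obj in triples:
        graph[subj].add(obj)
        graph[obj].add(subj)
    return graph


def path_support(graph, subject, obj):
    """Score how well the graph connects two entities.

    Returns a value in (0, 1], or 0.0 if the entities are not connected.
    A path costs the sum of log(degree) over its intermediate nodes, and
    the score is 1 / (1 + cost of the cheapest path). A direct edge has
    no intermediate nodes and therefore scores 1.0.
    """
    if subject not in graph or obj not in graph:
        return 0.0
    best = {subject: 0.0}
    queue = [(0.0, subject)]
    while queue:  # Dijkstra's algorithm; all step costs are non-negative
        cost, node = heapq.heappop(queue)
        if node == obj:
            return 1.0 / (1.0 + cost)
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt in graph[node]:
            # Reaching the target is free; any other node is an intermediate.
            step = 0.0 if nxt == obj else math.log(len(graph[nxt]))
            new_cost = cost + step
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return 0.0


# Hypothetical toy knowledge graph, for illustration only.
triples = [
    ("Barack Obama", "bornIn", "Honolulu"),
    ("Honolulu", "locatedIn", "Hawaii"),
    ("Hawaii", "partOf", "United States"),
    ("Canberra", "capitalOf", "Australia"),
]
graph = build_graph(triples)
direct = path_support(graph, "Barack Obama", "Honolulu")
indirect = path_support(graph, "Barack Obama", "Hawaii")
unconnected = path_support(graph, "Barack Obama", "Canberra")
```

In this sketch a claim linking two directly connected entities gets the maximum score, a claim supported only through intermediate entities gets a lower one, and a claim between unconnected entities gets zero. A practical system would also have to map claim text to graph entities and account for predicate types, both of which are omitted here.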
Key papers in this wiki
- Quelle & Bovet (2023) — The Perils & Promises of Fact-checking with Large Language Models — Evaluates GPT-3.5 and GPT-4 as automated fact-checkers on PolitiFact and multilingual Data Commons datasets. GPT-4 outperforms GPT-3.5, and adding contextual information retrieved via Google Search substantially improves accuracy (10–20 percentage points). The results also reveal language-specific disparities: non-English claims are checked less accurately in their original language than in English translation.
- Augenstein et al. (2019) — MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims — Large-scale real-world fact verification dataset (34,918 claims from 26 fact-checking websites) with evidence retrieval and multi-task learning for cross-domain veracity prediction
- Ciampaglia et al. (2015) — Computational fact checking from knowledge networks — Frames fact checking as shortest-path computation in knowledge graphs; reports high accuracy using only structural features of the graph
- Thorne et al. (2018) — FEVER: A large-scale dataset for fact extraction and verification — Benchmark dataset and shared task for evidence-based fact verification
- Wang (2017) — LIAR: A Benchmark Dataset for Fake News Detection — Dataset of short political statements from PolitiFact with six-way truthfulness labels and speaker metadata
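
Evidence-based benchmarks such as FEVER and MultiFC pair each claim with retrieved evidence. The retrieval step can be sketched as ranking candidate sentences by their similarity to the claim. The example below is a minimal, standard-library-only sketch using TF-IDF cosine similarity; the function names `tokenize` and `rank_evidence`, the smoothed IDF formula, and the sample sentences are assumptions for illustration and are not taken from any of the papers listed above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_evidence(claim, sentences):
    """Rank candidate evidence sentences by TF-IDF cosine similarity to the claim.

    Returns a list of (score, sentence) pairs, highest score first.
    """
    docs = [Counter(tokenize(s)) for s in sentences]
    n_docs = len(docs)
    # Document frequency: in how many sentences each term appears.
    doc_freq = Counter(term for doc in docs for term in doc)

    def weights(counts):
        # Smoothed IDF so that terms unseen in the corpus still get a finite weight.
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weights(Counter(tokenize(claim)))
    scored = [(cosine(query, weights(doc)), s) for doc, s in zip(docs, sentences)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical claim and candidate sentences, for illustration only.
claim = "The Eiffel Tower is located in Paris."
sentences = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Mount Everest is the highest mountain above sea level.",
    "Bananas are rich in potassium.",
]
ranked = rank_evidence(claim, sentences)
```

Ranking only selects which sentences are relevant; it does not say whether they support or refute the claim. In a full pipeline that verdict would come from a separate step, for example a trained classifier applied to each claim and evidence pair.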
Related topics
- Misinformation and fake news detection — Broader category encompassing detection methods
- Knowledge graphs — Structured knowledge bases used in graph-based fact checking
- Fact-checking and corrections — Manual and automated approaches to verification
- Evidence Based Reasoning — Using evidence and reasoning to validate claims