Veracity prediction¶

Veracity prediction is the task of automatically determining whether a claim or statement is true, false, or unverifiable based on available evidence and external knowledge.

Problem formulation¶

Given a claim (e.g., a tweet, news headline, or statement), predict:

True: The claim is factually accurate
False: The claim is factually incorrect
Unverifiable/Unknown: The claim cannot be verified with available information

Systems may also return confidence scores (0–1) indicating certainty.
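
The output described above can be represented as a small record type. The following is a minimal sketch, not a standard interface: the names `Veracity` and `VeracityPrediction` are hypothetical, and the only constraint enforced is that the confidence lies in the 0–1 range mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class Veracity(Enum):
    """The three labels from the problem formulation (names are illustrative)."""
    TRUE = "true"
    FALSE = "false"
    UNVERIFIABLE = "unverifiable"


@dataclass(frozen=True)
class VeracityPrediction:
    """A predicted label plus a confidence score in [0, 1]."""
    claim: str
    label: Veracity
    confidence: float

    def __post_init__(self) -> None:
        # Reject scores outside the 0-1 range used for confidence.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# Example claim text is made up for illustration.
prediction = VeracityPrediction(
    claim="The bridge was closed on Monday.",
    label=Veracity.UNVERIFIABLE,
    confidence=0.4,
)
print(prediction.label.value, prediction.confidence)
```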

Approaches¶

Text-based (closed variant)¶

Predict veracity using only the claim text itself:

- Linguistic cues (hedging, certainty markers, temporal language)
- Lexical patterns associated with misinformation
- Pre-trained language models with fine-tuning
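
As a rough illustration of the linguistic-cue idea, the sketch below counts hedging, certainty, and temporal words in a claim. The word lists and the function name `cue_features` are assumptions made for this example, not an established lexicon; in practice such counts would be fed to a classifier rather than used on their own.

```python
import re
from typing import Dict

# Tiny illustrative cue lexicons (assumed for this sketch, not exhaustive).
HEDGES = {"allegedly", "reportedly", "possibly", "might", "may", "unconfirmed"}
CERTAINTY = {"definitely", "confirmed", "certainly", "proven", "officially"}
TEMPORAL = {"today", "yesterday", "tomorrow", "now", "breaking"}


def cue_features(claim: str) -> Dict[str, int]:
    """Count cue words of each type in a lower-cased, tokenized claim."""
    tokens = re.findall(r"[a-z']+", claim.lower())
    return {
        "hedging": sum(t in HEDGES for t in tokens),
        "certainty": sum(t in CERTAINTY for t in tokens),
        "temporal": sum(t in TEMPORAL for t in tokens),
        "n_tokens": len(tokens),
    }


features = cue_features(
    "BREAKING: Officials reportedly confirmed the bridge might close today"
)
print(features)
```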

Context-augmented (open variant)¶

Use additional external information:

- Wikipedia articles and knowledge bases
- Archived web content and linked URLs
- Community responses and stance labels
- Temporal metadata and event context

Community-informed¶

Leverage collective intelligence:

- Aggregating community stance (support/deny/query) to infer veracity
- User credibility and comment patterns
- Conversational signals and debate outcomes
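
One simple way to turn stance labels into a veracity guess is a credibility-weighted vote. The sketch below is a heuristic under stated assumptions: the function name `aggregate_stance`, the `margin` threshold, the credibility weights, and the way confidence is derived are all illustrative choices, and replies that neither support nor deny (such as queries) are ignored.

```python
from collections import Counter
from typing import Iterable, Tuple


def aggregate_stance(
    replies: Iterable[Tuple[str, float]], margin: float = 0.2
) -> Tuple[str, float]:
    """Infer a veracity label from (stance, credibility) pairs.

    Only "support" and "deny" replies vote; other stances are ignored.
    Returns (label, confidence) with confidence in [0, 1].
    """
    weights: Counter = Counter()
    for stance, credibility in replies:
        weights[stance] += credibility

    support, deny = weights["support"], weights["deny"]
    total = support + deny
    if total == 0:
        return "unverifiable", 0.0  # no usable votes at all

    score = (support - deny) / total  # in [-1, 1]
    if score > margin:
        return "true", abs(score)
    if score < -margin:
        return "false", abs(score)
    # Votes are nearly balanced: the closer to zero, the more confident
    # this heuristic is that the claim cannot be settled.
    return "unverifiable", 1.0 - abs(score)


# Hypothetical replies: (stance label, user credibility weight).
replies = [
    ("support", 0.9),
    ("support", 0.5),
    ("deny", 0.2),
    ("query", 0.4),
]
label, confidence = aggregate_stance(replies)
print(label, round(confidence, 2))
```

A learned model would normally replace the fixed threshold, but the vote makes the intuition explicit: strong, credible agreement pushes the label toward true or false, while balanced or absent stances leave the claim unverifiable.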

Challenges¶

AI-hard problem: Veracity often requires domain expertise, event knowledge, and real-world reasoning
Bias and subjectivity: Determining ground truth for controversial claims is difficult
Temporal sensitivity: Claims may be true in one time period and false in another
Information gaps: Necessary evidence may not be available at prediction time
Class imbalance: True, false, and unverifiable claims occur at different rates, so a model that favors the majority class can look deceptively accurate

Evaluation metrics¶

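Accuracy: Fraction of predictions that are correct (simple baseline metric; can be misleading when classes are imbalanced)
Macro-averaged accuracy: Unweighted mean of per-class accuracy, so each class counts equally regardless of its size (addresses imbalance)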
Confidence-aware metrics: RMSE of predicted vs. reference confidence scores
F1 / Precision / Recall: Per-class performance metrics
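
The metrics above can be computed with the standard library alone. This is a minimal sketch with hypothetical function names; it assumes "per-class accuracy" means the fraction of each gold class that is predicted correctly (that is, per-class recall), and that reference confidence scores are supplied alongside predicted ones.

```python
import math
from typing import Dict, Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of predictions that match the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_class_scores(
    gold: Sequence[str], pred: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1 for every label seen in gold or pred."""
    scores = {}
    pairs = list(zip(gold, pred))
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class recall over the gold classes."""
    scores = per_class_scores(gold, pred)
    labels = set(gold)
    return sum(scores[label]["recall"] for label in labels) / len(labels)


def confidence_rmse(
    reference: Sequence[float], predicted: Sequence[float]
) -> float:
    """Root mean squared error between reference and predicted confidence."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy imbalanced example: the single "true" claim is misclassified.
gold = ["false", "false", "false", "false", "true", "unverifiable"]
pred = ["false", "false", "false", "false", "false", "unverifiable"]

print(accuracy(gold, pred))
print(macro_accuracy(gold, pred))
print(per_class_scores(gold, pred)["false"]["f1"])
print(confidence_rmse([1.0, 0.0], [0.8, 0.4]))
```

In this toy data, plain accuracy is 5 out of 6 even though the only true claim is missed, while the macro average drops to 2 out of 3 because the true class contributes a score of zero.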

Key papers and benchmarks¶

Augenstein et al. (2019), "MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims": 34,918 real-world claims from 26 fact-checking domains, with evidence retrieval and entity linking. The paper applies multi-task learning to veracity prediction across domains with heterogeneous label spaces; its best model achieves a Macro F1 of 49.2%.
SemEval-2017 Task 8 (RumourEval): shared task whose veracity prediction subtask has closed and open variants