Factuality assessment¶

Factuality assessment involves determining whether a claim, article, or source is truthful. It can be performed at multiple granularities: claim-level (fact-checking individual assertions), article-level (fake news detection), or source-level (media outlet reliability prediction).

Levels of analysis¶

Claim-level fact-checking: Verifying specific factual assertions by retrieving evidence from reliable sources (Wikipedia, fact-checking websites, news articles, academic papers). Requires NLP for claim-evidence matching and truth judgment.

Article-level fake news detection: Classifying whether an entire article contains false information. Uses linguistic features (deceptive language, sensationalism), network signals (who shares it), and source credibility.

Source-level factuality prediction: Assessing whether a news outlet publishes reliable information based on its history. Enables rapid detection without analyzing every article; useful when the outlet has many published articles.
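As a rough illustration of how an outlet's history could be turned into a source-level label, the sketch below maps per-article verdicts to a three-way rating. The function name `outlet_rating` and both cutoffs are hypothetical choices made for this example; they are not taken from any rating organization or from the papers listed on this page.

```python
def outlet_rating(article_verdicts, low_cutoff=0.4, high_cutoff=0.8):
    """Map per-article verdicts (True = judged accurate) to low/mixed/high.

    The cutoffs are illustrative assumptions, not an established standard.
    """
    if not article_verdicts:
        raise ValueError("need at least one fact-checked article")
    share_accurate = sum(article_verdicts) / len(article_verdicts)
    if share_accurate >= high_cutoff:
        return "high"
    if share_accurate < low_cutoff:
        return "low"
    return "mixed"


# Toy histories for three imaginary outlets.
mostly_accurate = [True] * 9 + [False]
half_and_half = [True, False]
mostly_false = [False, False, True]
```

Once such a rating exists, a new article from the outlet can inherit it as a prior before any claim in the article has been checked individually.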

Challenges¶

Temporal variation: An outlet's factuality rating is not static. Sources may improve through corrections or degrade through changed editorial standards.

Ground truth scarcity: Labeled datasets are limited; most work relies on manually verified labels from sources such as Media Bias/Fact Check, PolitiFact, or Snopes.

Annotation vs. automation gap: Human annotators judge factuality using criteria not always accessible to automated systems (e.g., external expert knowledge, domain-specific facts).

Issue-specificity: Outlets may be reliable on some topics but unreliable on others, making global factuality ratings imperfect.

Measurement bias: Outlet factuality is inferred from a sample of articles; non-uniform sampling across topics introduces bias in the estimated label.
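The issue-specificity and measurement-bias points can be made concrete with a small worked example. If the fact-checked sample over-represents one topic, the pooled accuracy differs from an estimate that weights each topic by its share of the outlet's actual output. The topic names, counts, and shares below are invented for illustration, and the reweighting shown is one simple correction under the assumption that the true topic mix is known.

```python
def pooled_accuracy(sample):
    """sample maps topic -> (n_accurate, n_checked); pools all checked articles."""
    accurate = sum(a for a, _ in sample.values())
    checked = sum(n for _, n in sample.values())
    return accurate / checked


def reweighted_accuracy(sample, topic_share):
    """Weight each topic's accuracy by that topic's share of the outlet's output."""
    return sum(topic_share[t] * a / n for t, (a, n) in sample.items())


# Fact-checkers sampled mostly politics, but the outlet mostly publishes sports.
sample = {"politics": (20, 80), "sports": (18, 20)}
topic_share = {"politics": 0.3, "sports": 0.7}

pooled = pooled_accuracy(sample)                       # 38 / 100
reweighted = reweighted_accuracy(sample, topic_share)  # 0.3 * 0.25 + 0.7 * 0.90
```

Here the pooled estimate is 0.38 while the reweighted estimate is 0.705, so the same outlet could land in different rating bands depending on how articles were sampled. The per-topic accuracies (0.25 versus 0.90) also show why a single global rating hides topic-level differences.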

Prediction approaches¶

Textual features: Linguistic markers of deceptiveness (hedging, negation, subjectivity); readability; claim-to-evidence language matching.
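A minimal sketch of what such textual features might look like in code is given below. The three word lists are tiny hypothetical lexicons chosen for the example, average sentence length stands in as a crude readability proxy, and contractions such as "isn't" are not treated as negations.

```python
import re

# Tiny illustrative lexicons; real systems use much larger curated lists.
HEDGES = {"may", "might", "could", "reportedly", "allegedly", "possibly"}
NEGATIONS = {"not", "no", "never", "none", "cannot"}
SUBJECTIVE = {"shocking", "outrageous", "amazing", "terrible", "unbelievable"}


def text_features(text):
    """Return per-token rates of marker words plus average sentence length."""
    tokens = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty input
    return {
        "hedge_rate": sum(t in HEDGES for t in tokens) / n_tokens,
        "negation_rate": sum(t in NEGATIONS for t in tokens) / n_tokens,
        "subjective_rate": sum(t in SUBJECTIVE for t in tokens) / n_tokens,
        "avg_sentence_len": len(tokens) / max(len(sentences), 1),
    }


features = text_features(
    "This shocking report may not be true. Officials never confirmed it."
)
```

Feature dictionaries of this kind would typically be fed to a classifier together with other signals rather than used on their own.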

Multimedia signals: Image forensics (detection of manipulated or deepfaked images); reverse image search to check authenticity of photo sources.

Social signals: Propagation speed and reach; retweet patterns; crowd verification scores from crowdsourcing platforms.
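As a sketch of how propagation speed and reach could be quantified, the function below summarizes a list of share events. The event format (`user_id`, seconds since the original post), the one-hour window, and the feature names are all assumptions made for this example.

```python
from statistics import median


def propagation_features(share_events, early_window=3600.0):
    """share_events: list of (user_id, seconds_since_original_post)."""
    if not share_events:
        return {"reach": 0, "n_shares": 0, "early_fraction": 0.0, "median_delay": None}
    delays = [delay for _, delay in share_events]
    return {
        # Reach is approximated by the number of distinct sharing users.
        "reach": len({user for user, _ in share_events}),
        "n_shares": len(share_events),
        # Speed: fraction of shares that happened inside the early window.
        "early_fraction": sum(d <= early_window for d in delays) / len(delays),
        "median_delay": median(delays),
    }


events = [("a", 60), ("b", 120), ("a", 1800), ("c", 7200)]
signals = propagation_features(events)
```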

Evidence retrieval: Matching article claims to fact-checked claims in databases; retrieving supporting evidence from Wikipedia, news, or scientific sources.
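Matching a new claim against previously fact-checked claims can be sketched with TF-IDF weighting and cosine similarity, using only the standard library. This is a simple lexical baseline chosen for illustration, not the method of any particular system; the function names and the three example claims are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    n_docs = len(docs)
    # Smoothed IDF so that every seen term gets a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(claim, checked_claims):
    """Return the most similar fact-checked claim and its similarity score."""
    vectors, idf = tfidf_vectors(checked_claims)
    # Terms never seen in the database get zero weight in the query vector.
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(claim)).items()}
    scores = [cosine(query, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return checked_claims[best], scores[best]


checked = [
    "The Eiffel Tower is located in Berlin",
    "Vaccines cause autism",
    "The moon landing was staged in a studio",
]
match, score = best_match("Was the moon landing staged?", checked)
```

In practice the database would store a verdict alongside each claim, and a similarity threshold would decide whether the best match is close enough to reuse that verdict. Lexical matching of this kind misses paraphrases, which is one reason more semantic matching is needed for claim-evidence comparison.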

Source credibility: Estimating factuality from source reputation rather than article content.

Key papers¶

Predicting Factuality of Reporting and Bias of News Media Sources: source-level factuality prediction (low/mixed/high) using article text, Wikipedia, Twitter, URL structure, and web traffic; shows that article features are the most predictive; introduces a dataset of 1,066 websites
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media: ordinal regression for jointly predicting outlet trustworthiness (3-point scale) and political ideology (7-point scale); shows that multi-task learning with auxiliary tasks reduces prediction error
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context: source-level factuality prediction using article text, YouTube features, and audience demographics; finds factuality harder to predict than bias
A Survey on Predicting the Factuality and the Bias of News Media: survey of source-level factuality prediction; attributes the greater difficulty of predicting factuality, compared with bias, to its dependence on external ground truth
A Survey on Multimodal Disinformation Detection: multimodal approaches to factuality detection covering text, images, video, and network signals

Related topics¶

Fact-checking and corrections: systematic verification of claims, often informed by source factuality
Source reliability: assessing whether a source is trustworthy; factuality is one dimension
Media profiling: predicting both factuality and bias of news outlets
Rumour verification: determining the truth of social media claims