Source reliability

Source reliability is the degree to which information published by a particular source (a news outlet, a user, or a website) can be trusted. Estimating it is critical for both manual and automated fact-checking, because the credibility of evidence retrieved during verification depends on the trustworthiness of its source.

Levels of analysis

User-level: Assessing individual social media users or commenters based on their posting history, follower relationships, and interaction patterns. This includes detecting paid trolls, sockpuppets, and other inauthentic accounts.

Source-level (media outlet): Profiling entire news organizations based on their publishing patterns, audience composition, and infrastructure. This scales better than checking individual claims, because one outlet-level estimate applies to everything the outlet publishes, and it enables rapid detection of unreliable content.

Prediction methods

Source reliability can be inferred from:

Content analysis: Linguistic features, factual accuracy of past articles, tone and sentiment
Audience signals: Composition and ideology of followers/readers; who interacts with the outlet
Network features: Shared links, citations, and relationships with other outlets; co-citation patterns with known reliable sources
Infrastructure: Domain registration practices, hosting patterns, SSL certificate characteristics
Editorial patterns: Consistency of fact-checking labels, corrections published, update frequency
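
As a rough illustration of how signals from several of these groups might be combined into one feature record per outlet, here is a minimal sketch. The function names (`link_host`, `outlet_features`), the feature names, the trusted-domain list, and all input values are hypothetical and chosen only for this example; a real system would use far richer features and feed them to a trained classifier.

```python
from urllib.parse import urlparse


def link_host(url: str) -> str:
    """Return the lowercase host of a URL without a leading 'www.', or '' if none."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def outlet_features(outbound_urls, reliable_domains, uses_https, corrections, articles):
    """Build a small feature dict for one outlet (feature names are illustrative)."""
    # Network feature: share of outbound links that point at trusted domains.
    hosts = [h for h in map(link_host, outbound_urls) if h]
    reliable_share = (
        sum(h in reliable_domains for h in hosts) / len(hosts) if hosts else 0.0
    )
    return {
        "reliable_link_share": reliable_share,
        # Infrastructure feature: whether the site is served over HTTPS.
        "uses_https": 1.0 if uses_https else 0.0,
        # Editorial feature: published corrections per 100 articles.
        "corrections_per_100_articles": (
            100.0 * corrections / articles if articles else 0.0
        ),
    }


# Made-up inputs for a single outlet; the last entry has no host and is skipped.
features = outlet_features(
    outbound_urls=[
        "https://www.example.org/a",
        "http://trusted.example/b",
        "https://other.example/c",
        "not a url",
    ],
    reliable_domains={"example.org", "trusted.example"},
    uses_https=True,
    corrections=3,
    articles=200,
)
print(features)
```

Keeping each signal group as a separate named feature makes it easy to test which groups actually carry predictive weight for a given label set.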

Challenges

Temporal drift: Source reliability changes over time; outlets may improve or degrade their practices
Issue-specific variation: An outlet may be reliable on some topics but biased on others
Ground truth scarcity: Manually verified source-reliability labels are limited, so work in this area relies on resources such as Media Bias/Fact Check or AllSides
Class imbalance: Most outlets fall into moderate categories; extremes are rare
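
One practical consequence of class imbalance is that plain accuracy can look good for a model that never predicts the rare extremes. The sketch below contrasts accuracy with macro-averaged F1, which weights every class equally. The three label names and the toy predictions are assumptions made up for illustration, not data from any of the cited work.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: most outlets are "mixed"; the classifier always predicts "mixed".
y_true = ["mixed"] * 8 + ["low", "high"]
y_pred = ["mixed"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

On this toy data the majority-class predictor reaches 0.80 accuracy but only about 0.30 macro F1, because it scores zero on both rare classes.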

Key papers

Predicting Factuality of Reporting and Bias of News Media Sources: pioneering work on media-level factuality and bias prediction using multiple information sources (text, Wikipedia, Twitter, URLs, web traffic); introduces a dataset of 1,066 websites
A Survey on Predicting the Factuality and the Bias of News Media: survey of source reliability prediction using textual, multimedia, audience, and infrastructure features

Related topics

Media profiling: predicting factuality and bias of entire outlets
Factuality assessment: determining whether information is true or false
Fact-checking and corrections: manual or automated verification of claims, often informed by source credibility
