Media profiling

Media profiling assesses the factuality and bias of entire news outlets rather than individual claims or articles. The key insight is that an outlet with a consistent record of publishing false or heavily biased content is likely to continue doing so, so potentially unreliable content can be flagged the moment it is published simply by checking its source.

Rationale

Manual fact-checking of every suspicious claim is infeasible at scale. Viral misinformation is reported to spread 6× faster than true news, with over 50% of sharing happening within the first ten minutes. Source-level profiling flags content without waiting for evidence about an individual claim to accumulate, which makes it especially valuable in time-sensitive contexts.

Approaches

Textual features: Linguistic markers (sentiment, hedging, subjectivity) computed over the articles a medium publishes; BERT or Sentence-BERT embeddings averaged across those articles; article-level posterior probabilities aggregated into an outlet-level prediction.
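
As a rough illustration of posterior aggregation, the sketch below averages per-article class probabilities into one profile per outlet. The article-level classifier is assumed to exist already; the outlet names, the label set ("low", "mixed", "high") and the function names are hypothetical, and simple averaging is only one possible aggregation rule.

```python
from collections import defaultdict


def aggregate_posteriors(article_predictions):
    """Average per-article class probabilities for each outlet.

    article_predictions: iterable of (outlet, {label: probability}) pairs,
    assumed to come from some article-level classifier (not shown here).
    Returns {outlet: {label: mean probability}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for outlet, probs in article_predictions:
        counts[outlet] += 1
        for label, p in probs.items():
            sums[outlet][label] += p
    return {
        outlet: {label: total / counts[outlet] for label, total in label_sums.items()}
        for outlet, label_sums in sums.items()
    }


def predict_outlet_label(outlet_posterior):
    """Pick the label with the highest averaged probability."""
    return max(outlet_posterior, key=outlet_posterior.get)


# Hypothetical posteriors for three articles from two made-up outlets.
predictions = [
    ("example-outlet.test", {"low": 0.7, "mixed": 0.2, "high": 0.1}),
    ("example-outlet.test", {"low": 0.5, "mixed": 0.3, "high": 0.2}),
    ("other-outlet.test", {"low": 0.1, "mixed": 0.2, "high": 0.7}),
]
profiles = aggregate_posteriors(predictions)
labels = {outlet: predict_outlet_label(p) for outlet, p in profiles.items()}
```

Averaging treats every article equally; weighting by article length or classifier confidence would be an equally plausible design choice.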

Multimedia analysis: Visual characteristics of images (deep learning representations, reverse image search provenance); video and audio forensics for deepfakes and manipulated media.

Audience homophily: Ideological profiles of followers on Twitter, Facebook, or YouTube; audience distribution across the political spectrum; self-descriptions of users following media accounts.
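
One way to picture the audience-distribution feature is as a normalized histogram of follower leanings. The sketch below assumes each follower has already been assigned a leaning by some upstream method; the three-bin scheme, the sample data and the function name are illustrative assumptions rather than part of any particular system.

```python
from collections import Counter

# Hypothetical coarse bins for the political spectrum.
BINS = ("left", "center", "right")


def audience_distribution(follower_leanings):
    """Return the share of followers in each bin, ignoring unlabeled followers."""
    counts = Counter(leaning for leaning in follower_leanings if leaning in BINS)
    total = sum(counts.values())
    if total == 0:
        # No usable followers: return an all-zero feature vector.
        return {b: 0.0 for b in BINS}
    return {b: counts[b] / total for b in BINS}


# Made-up follower leanings for one media account; "unknown" is skipped.
followers = ["left", "left", "center", "right", "left", "unknown"]
dist = audience_distribution(followers)
```

The resulting vector can be fed to an outlet-level classifier alongside other features.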

Infrastructure characteristics: Domain registration patterns, DNS/certificate metadata, web design features, shared data objects (scripts, images) across websites. Content-agnostic and useful for nascent outlets with limited published material.
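
The shared-data-objects idea can be sketched as a set-overlap comparison between a new site and already profiled sites. Jaccard similarity is used here purely as an illustrative choice; the site names, resource identifiers and function names are hypothetical, and collecting the resources (crawling, hashing scripts and images) is assumed to happen elsewhere.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of resource identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_site(new_site_resources, known_sites):
    """Return (site, similarity) for the profiled site sharing the most resources.

    known_sites: {site name: set of resource identifiers}.
    Returns (None, 0.0) when there are no profiled sites.
    """
    best_site, best_score = None, 0.0
    for site, resources in known_sites.items():
        score = jaccard(new_site_resources, resources)
        if best_site is None or score > best_score:
            best_site, best_score = site, score
    return best_site, best_score


# Made-up resource sets for two profiled sites and one new site.
known = {
    "site-a.test": {"tracker.js", "logo.png", "theme.css"},
    "site-b.test": {"widget.js", "banner.png"},
}
new_site = {"tracker.js", "logo.png", "article.css"}
match = most_similar_site(new_site, known)
```

A high overlap with a site that already has a profile could serve as a prior for the new outlet before it has published enough text to analyze.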

Key papers