Selection Bias
Selection bias occurs when the data we observe does not represent the true underlying process due to non-random exposure, self-selection, or platform filtering. In social media, selection bias is ubiquitous: we only observe posts users chose to engage with or that algorithms chose to show them.
In the context of fake news, selection bias creates a measurement problem: if a user did not share a piece of fake news, we cannot distinguish between "the user was not interested" and "the user was not exposed." This missing-not-at-random (MNAR) pattern biases estimators of user behavior if left unaddressed.
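The ambiguity can be made concrete with a small, entirely hypothetical population. The sketch below assumes each user has a hidden `exposed` flag and a hidden `interested` flag (meaning "would share if exposed"), and that the analyst sees only whether a share happened. The names and the ten data points are invented for illustration and do not come from any dataset.

```python
# Hypothetical toy population: each tuple is (exposed, interested).
# "interested" means the user would share the item if they saw it.
users = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
    (False, False), (False, False), (False, False),
]

# The analyst only sees shares: a share requires both exposure and interest.
shared = [exposed and interested for exposed, interested in users]

# Ground truth, unavailable in practice because "interested" is never logged.
true_interest_rate = sum(interested for _, interested in users) / len(users)

# Naive estimate: treats every non-share as "not interested".
naive_rate = sum(shared) / len(shared)

# Exposed-only estimate: needs exposure logs, and is still biased here
# because exposure is correlated with interest in this toy population.
n_exposed = sum(exposed for exposed, _ in users)
exposed_only_rate = sum(shared) / n_exposed

print(true_interest_rate, naive_rate, exposed_only_rate)  # 0.4 0.3 0.75
```

In this toy setup the naive estimate is too low because unexposed but interested users are counted as uninterested, while the exposed-only estimate is too high because interested users were more likely to be exposed.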
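Standard approaches to mitigating selection bias in observational studies include inverse propensity scoring (IPS), which weights each observed data point by the inverse of its estimated probability of being observed, so that the reweighted sample approximates what a randomized experiment would have produced.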
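As a rough illustration of the reweighting idea, the following sketch applies an IPS estimator to hypothetical data. It assumes each user's exposure propensity is already known and that exposure is effectively random within each group. In practice propensities have to be estimated, and the estimate is only as good as that model. The `Observation` class, the helper functions, and the two user groups are all invented for this example and are not taken from the paper listed under Key papers.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    propensity: float  # assumed-known probability that this user was exposed
    shared: bool       # observed outcome; a share implies the user was exposed


def make_group(n_users, propensity, n_shared):
    """Build a hypothetical group in which the first n_shared users shared."""
    return [Observation(propensity, i < n_shared) for i in range(n_users)]


def naive_estimate(observations):
    """Share rate that treats every non-share as lack of interest."""
    return sum(o.shared for o in observations) / len(observations)


def ips_estimate(observations):
    """IPS estimate of the rate at which users would share if all were exposed."""
    # Each observed share is up-weighted by 1 / propensity to stand in for
    # similar users who were never exposed.
    return sum(o.shared / o.propensity for o in observations) / len(observations)


# Hypothetical data: 8 active users (propensity 0.5, 3 shares) and
# 12 inactive users (propensity 0.25, 1 share).
data = make_group(8, 0.5, 3) + make_group(12, 0.25, 1)

print(naive_estimate(data))  # 0.2
print(ips_estimate(data))    # 0.5
```

The toy groups are constructed so that the IPS estimate (0.5) matches the share rate implied by full exposure, assuming 6 of 8 active and 4 of 12 inactive users would share, whereas the naive rate (0.2) understates it. With very small propensities the weights become large and the estimate noisy, which is why weight clipping or self-normalization is often used in practice.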
Key papers
- Causal Understanding of Fake News Dissemination on Social Media — applies inverse propensity scoring to mitigate selection bias in fake news dissemination models
Related topics
- Causal Inference (methods to address bias)
- Confounding (unmeasured causes of bias)