Annotation bias
Systematic differences in how annotators label or interpret data, arising from their backgrounds, expertise, cultural perspectives, and demographic characteristics. Annotation bias directly shapes the behavior of downstream models trained on the labeled data: models can inherit, and in some cases amplify, the biases present in their training annotations. The problem is particularly acute in sensitive domains such as hate speech detection, toxicity classification, and content moderation, where the labels depend heavily on subjective judgment.
Key papers
- Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter — Shows that amateur annotators label significantly more content as hate speech than expert annotators do; systems trained on expert annotations outperform those trained on crowdsourced annotations by ~5-8 F1 points
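
The kind of gap described above (one annotator group flagging more content than another) can be checked on any doubly-annotated sample. The following is a minimal sketch, not the method used in the paper: the label lists are made-up toy data, the function names `positive_rate` and `cohen_kappa` are hypothetical, and Cohen's kappa is brought in here as one common chance-corrected agreement measure, assuming each item has exactly one label per group.

```python
from collections import Counter


def positive_rate(labels, positive="hate"):
    """Fraction of items that received the positive label."""
    return sum(1 for label in labels if label == positive) / len(labels)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: share of items where both sources give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both sources labeled independently
    # according to their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    classes = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in classes) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy data (illustrative only): one label per item from each group.
expert = ["none", "none", "hate", "none", "none", "hate", "none", "none"]
amateur = ["none", "hate", "hate", "none", "hate", "hate", "none", "hate"]

print(positive_rate(expert))    # 0.25
print(positive_rate(amateur))   # 0.625
print(f"{cohen_kappa(expert, amateur):.3f}")  # 0.333
```

A large difference in positive rates together with low kappa suggests the two groups are applying different labeling standards rather than disagreeing at random, which is one signal that a model trained on either set will learn a different decision boundary.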
Related topics
- Data quality — broader issues with label noise and annotation reliability
- Crowdsourcing — economics and quality challenges of crowdsourced annotation
- Hate speech detection — domain where annotator expertise critically matters
- Demographic bias — how annotator demographics influence labels