Annotation bias

Systematic differences in how annotators label or interpret data, arising from their backgrounds, expertise, cultural perspectives, and demographic characteristics. Annotation bias shapes the behavior of downstream models trained on the labeled data: models can inherit the biases of their training annotations and, in some cases, amplify them. The problem is particularly acute in sensitive domains such as hate speech detection, toxicity classification, and content moderation, where the labels themselves depend on subjective judgment.
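One simple way to make "systematic differences" concrete is to compare how often different annotator groups assign a given label to the same items. The sketch below is only illustrative: the data, the group names (`group_a`, `group_b`), and the helper `positive_rate_by_group` are all hypothetical, not taken from any dataset or paper.

```python
from collections import defaultdict

# Hypothetical toy data: (item_id, annotator_group, label), where label 1 = "toxic".
# Every item is labeled once by each group, so the items are held constant.
annotations = [
    ("t1", "group_a", 1), ("t1", "group_b", 0),
    ("t2", "group_a", 1), ("t2", "group_b", 1),
    ("t3", "group_a", 1), ("t3", "group_b", 0),
    ("t4", "group_a", 0), ("t4", "group_b", 0),
]


def positive_rate_by_group(rows):
    """Return the fraction of labels equal to 1 for each annotator group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for _item, group, label in rows:
        counts[group][0] += label
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


rates = positive_rate_by_group(annotations)
gap = rates["group_a"] - rates["group_b"]
print(rates, gap)
```

Because both groups label the same items, a gap in positive rates here cannot come from differences in the content and points instead to a difference in how the groups apply the label. On real data, a gap like this is only a starting signal; it does not say which group, if either, is closer to a ground truth.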

Key papers