Crowdsourcing

Crowdsourcing in NLP refers to collecting human annotations and judgments from geographically distributed, often non-expert workers via platforms such as Amazon Mechanical Turk. It makes large-scale annotation and evaluation feasible, but it raises challenges in quality control, inter-annotator agreement, and the demographic representativeness of the worker pool.
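
As an illustration of inter-annotator agreement, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic for two annotators who labeled the same items. The function name `cohen_kappa` and the toy labels are assumptions made for this example, not part of any particular platform or toolkit.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical sentiment labels from two crowdworkers.
a = ["pos", "pos", "neg", "neg"]
b = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(a, b)
print(kappa)  # observed 0.75, expected 0.5, so kappa is 0.5
```

With more than two annotators per item, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the usual choice instead.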

Typical applications

  • Data annotation: Labeling datasets for training (e.g., stance, sentiment, factuality)
  • Evaluation: Collecting human judgments of system outputs (e.g., translation quality, generated text naturalness)
  • Quality control: Detecting low-quality crowdworkers and aggregating noisy judgments
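
The quality-control step can be sketched as follows, assuming judgments arrive as `(worker_id, item_id, label)` tuples. Majority voting is only one simple aggregation scheme, and the helper names, the toy data, and the 0.5 threshold are all hypothetical choices for illustration.

```python
from collections import Counter, defaultdict

def majority_vote(judgments):
    """Aggregate (worker_id, item_id, label) tuples into one label per item."""
    by_item = defaultdict(list)
    for _worker, item, label in judgments:
        by_item[item].append(label)
    # On a tie, most_common returns the label that was seen first.
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in by_item.items()}

def worker_agreement(judgments, consensus):
    """Fraction of each worker's labels that match the aggregated label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for worker, item, label in judgments:
        totals[worker] += 1
        hits[worker] += int(label == consensus[item])
    return {worker: hits[worker] / totals[worker] for worker in totals}

# Hypothetical judgments: three workers, three items.
judgments = [
    ("w1", "t1", "pos"), ("w2", "t1", "pos"), ("w3", "t1", "neg"),
    ("w1", "t2", "neg"), ("w2", "t2", "neg"), ("w3", "t2", "pos"),
    ("w1", "t3", "pos"), ("w2", "t3", "pos"), ("w3", "t3", "pos"),
]

consensus = majority_vote(judgments)
agreement = worker_agreement(judgments, consensus)
# Flag workers who agree with the consensus less than half the time (assumed threshold).
flagged = sorted(w for w, score in agreement.items() if score < 0.5)
print(consensus)
print(flagged)
```

Because each worker's own vote contributes to the consensus, this agreement score is somewhat optimistic. Gold-standard check items or a probabilistic annotator model are common alternatives when that bias matters.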

Key papers