
Content moderation

Content moderation encompasses the policies, enforcement mechanisms, and governance structures that platforms use to manage harmful, false, or policy-violating content. It covers both the moderation of human-generated content and, increasingly, the governance of AI-generated content and AI-assisted moderation tools.

Key approaches

Detection-based: Identifying and removing harmful content through human review, crowdsourcing, or algorithmic detection

Prevention-based: Structural design choices that reduce the spread of false content (e.g., algorithmic transparency, friction, diversification)

Governance-based: Establishing policies, appeals processes, and external review mechanisms
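
As a rough illustration of how detection-based moderation can combine algorithmic detection with human review, here is a minimal sketch. Everything in it is an assumption made for illustration: the blocklist terms, the two thresholds, and the names `score_text` and `triage` are hypothetical and do not describe any particular platform's system. Real deployments typically use trained classifiers rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical blocklist and thresholds, chosen only for illustration.
BLOCKLIST = {"scamlink", "badword"}
REMOVE_THRESHOLD = 0.5   # at or above this score, remove automatically
REVIEW_THRESHOLD = 0.2   # at or above this score, send to a human reviewer


@dataclass
class Decision:
    action: str   # "remove", "human_review", or "allow"
    score: float  # fraction of tokens that matched the blocklist


def score_text(text: str) -> float:
    """Return the fraction of whitespace-separated tokens found in BLOCKLIST."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in BLOCKLIST)
    return hits / len(tokens)


def triage(text: str) -> Decision:
    """Route content: clear cases are automated, borderline ones go to review."""
    score = score_text(text)
    if score >= REMOVE_THRESHOLD:
        return Decision("remove", score)
    if score >= REVIEW_THRESHOLD:
        return Decision("human_review", score)
    return Decision("allow", score)


if __name__ == "__main__":
    for post in ["scamlink scamlink", "click this scamlink now", "hello world"]:
        print(post, "->", triage(post).action)
```

The middle band between the two thresholds is where the sketch hands uncertain cases to human reviewers instead of acting automatically. A governance-based layer, such as an appeals process, would sit on top of decisions like these.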

Key papers

Platform governance and regulation

AI-generated content detection and mitigation

Offensive and abusive content detection

See also