Content Moderation

Rater Cohesion and Quality from a Vicarious Perspective

Asking people to predict how others with different political views would label content reveals hidden biases and improves data quality for content moderation AI.

Deepak Pandita
Read more
Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive featured image

Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive

We ran a massive experiment: 9 different AI content moderation systems analyzed 92 million YouTube comments about US politics. The results were shocking—different AI systems …

Tharindu Cyril Weerasooriya
Read more