Gaining Ground in the AI Content Moderation Trust Deficit

Abstract

For the past decade, content moderation on online platforms has been dominated by industrial-scale human labor, with reviewers assessing ever-growing volumes of content for violations of platforms’ rules and guidelines. In recent years, platforms have rapidly expanded their use of automation, machine learning, and artificial intelligence (AI), at first to improve efficiency by prioritizing content for review, but increasingly to make content moderation decisions entirely on their own. The use of AI for content moderation has met considerable pushback from users, stemming mainly from differing perceptions of AI systems, how they function, and their role in the curation process. Over this same period, psychology and HCI researchers have demonstrated that content moderation decisions made by AI are viewed as significantly less trustworthy than decisions made by humans, whether a jury, a community moderator, or a platform employee. This trust deficit created by increasing dependence on AI for content moderation is a problem for the field of “Trust and Safety,” given the already low levels of trust and legitimacy that people accord these platform actions. In this study, we use a survey-based experiment to explore opportunities for gaining ground in this AI trust deficit. We test the ability of three interventions, each drawing on unique advantages of LLMs, to build trust within the content moderation experience: explaining violations more thoroughly, assisting users in preparing appeals of moderation decisions, and collecting people’s feedback on a platform’s policies.

Presenters

Caroline Nobo
Research Scholar in Law & Executive Director, Yale Law School, Justice Collaboratory, Connecticut, United States

Details

Presentation Type

Paper Presentation in a Themed Session

Theme

Social Realities

Keywords

Content Moderation, Trust, Legitimacy, Online Governance, Social Media, Safety