Abstract
For the past decade, content moderation on online platforms has been dominated by industrial-scale human labor, with ever-growing teams of reviewers determining whether content violates platforms’ rules and guidelines. In recent years, platforms have rapidly expanded their use of automation, machine learning, and artificial intelligence (AI): at first to drive efficiency by prioritizing content for review, but increasingly to make content moderation decisions exclusively by AI. The use of AI for content moderation has met with considerable pushback from users, stemming largely from differing perceptions of AI systems, how they function, and their role in the curation process. Over this same period, psychology and HCI researchers have shown that content moderation decisions made by AI are viewed as significantly less trustworthy than decisions made by humans, whether a jury, a community moderator, or a platform employee. The trust deficit created by increasing dependence on AI for content moderation is a problem for the field of “Trust and Safety,” given the already low levels of trust and legitimacy that people accord these platform actions. In this study, we use a survey-based experiment to explore opportunities for gaining ground on this AI trust deficit. We test the ability of three interventions to build trust within the content moderation experience, each leveraging unique advantages of LLMs: explaining violations more thoroughly, assisting users in preparing appeals of moderation decisions, and collecting people’s feedback on a platform’s policies.
Presenters
Caroline Nobo, Research Scholar in Law & Executive Director, Justice Collaboratory, Yale Law School, Connecticut, United States
Details
Presentation Type
Paper Presentation in a Themed Session
Theme
Keywords
Content Moderation, Trust, Legitimacy, Online Governance, Social Media, Safety