Safety Snapshot: March
Full-Scene Moderation, a New Creator Dashboard, and Training for Community Managers
In our March Safety Snapshot, we’re sharing how we moderate content through automated systems designed to detect and take down problematic content, often before users ever see it. We’re also introducing a new dashboard that empowers creators to spot and address bad user behavior in their own games, and a new program to train community managers across the industry. See last month’s Safety Snapshot for more on user reporting tools.
Continuously Moderating User Behavior
Part of the magic of Roblox is the ever-expanding and changing content produced by our creators. At any moment, creators are publishing game updates and adding new badges or pets, and users are changing their avatars’ outfits and painting or building things in real time. We review all of this before it’s published to the platform and remove anything that’s against our guidelines.
Multiple layers of moderation tools catch the vast majority of problematic content on Roblox, and we reject anything that goes against our Community Standards. But we’re not perfect, so we leverage user reports to help us find anything we may have missed and take action accordingly.
The dynamic nature of Roblox games means content changes continually based on how users combine previously approved avatars, clothing, and movements. For example, in games that allow free-form drawing, a user could draw an offensive symbol or item. That’s why we recently launched a new AI system for real-time multimodal moderation that can scan these combinations together.
Traditional AI moderation systems are designed to evaluate one object at a time and often lack context, missing combinations that could be problematic in ways that the individual items aren’t. Our new real-time multimodal moderation system evaluates an entire scene—including avatars, text, and 3D objects. It captures all of these elements together in a specific moment and assesses whether the full scene breaks our rules. If this type of problematic behavior repeatedly occurs in a single game instance, the system will shut down just that instance (also called a server), rather than the entire game.
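To make the scene-level idea concrete, here is a minimal sketch in Python. Everything in it is illustrative rather than our production system: the names (SceneSnapshot, score_scene, REPEAT_LIMIT), the thresholds, and the trivial keyword check standing in for the multimodal model are all assumptions.

```python
"""Minimal sketch of scene-level (multimodal) moderation.

All names and thresholds here are hypothetical illustrations,
not Roblox's actual implementation.
"""
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SceneSnapshot:
    """Everything visible in one game instance at one moment."""
    instance_id: str
    avatars: list[str]    # avatar outfit/pose descriptors
    chat_text: list[str]  # recent chat messages
    objects: list[str]    # descriptors of 3D objects, drawings, builds


def score_scene(scene: SceneSnapshot) -> float:
    """Stand-in for a multimodal model that scores the *combined* scene.

    The key idea: avatars, text, and objects are judged together, so a
    combination can be flagged even when each element alone looks benign.
    A trivial keyword check stands in for the model here.
    """
    combined = " ".join(scene.avatars + scene.chat_text + scene.objects).lower()
    return 1.0 if "offensive symbol" in combined else 0.0


VIOLATION_THRESHOLD = 0.9  # assumed policy-violation cutoff
REPEAT_LIMIT = 3           # assumed repeat count before shutdown
_violations: dict[str, int] = defaultdict(int)


def moderate(scene: SceneSnapshot, shutdown) -> None:
    """Count violations per instance; shut down only that instance on repeats."""
    if score_scene(scene) < VIOLATION_THRESHOLD:
        return
    _violations[scene.instance_id] += 1
    if _violations[scene.instance_id] >= REPEAT_LIMIT:
        shutdown(scene.instance_id)  # one server, not the whole game


if __name__ == "__main__":
    scene = SceneSnapshot(
        instance_id="srv-42",
        avatars=["default avatar"],
        chat_text=["look at this"],
        objects=["free-form drawing: offensive symbol"],
    )
    for _ in range(REPEAT_LIMIT):
        moderate(scene, shutdown=lambda sid: print(f"shutting down instance {sid}"))
```

The design choice worth noting is that violation counts are tracked per instance, so repeated offenses shut down a single server while every other copy of the game keeps running.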
Since launching this multimodal system, we’ve shut down approximately 5,000 instances per day that violate our Community Standards. As we train and scale, we’re constantly improving our accuracy and working with the community to minimize false positives. We’re working to scale this multimodal system to capture and monitor 100% of playtime. But there will always be individuals working to circumvent any system, so we’re actively developing technology that goes beyond shutting down servers. We’re working on ways to identify specific bad actors so we can remove them without disrupting the experience for well-intentioned players.
Giving Creators Visibility Into Server Shutdowns
We’re giving creators greater transparency into the results of this multimodal moderation system through an addition to the safety overview dashboard. As noted above, we shut down game servers when they’re overtaken by bad user behavior. To give creators more visibility into how often this is happening in their games, we’ve added a new chart to their existing Creator Dashboard.
Creators can now see how many of their game servers have been shut down for bad user behavior (i.e., behavior that violates our harassment and discrimination or romantic and sexual content policies). This helps them spot a sudden increase so they can act before shutdowns affect their broader community. They can then take a closer look at their game and decide whether changes are needed to custom emotes, avatar editing tools, or in-game user creation features to help prevent problematic creations.
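Conceptually, the new chart is an aggregation of shutdown events by day and policy reason. The sketch below is hypothetical (the dashboard’s internals aren’t public) and simply illustrates the kind of grouping, and the kind of spike check, a creator might care about:

```python
"""Hypothetical sketch of the aggregation behind the new dashboard chart.

Groups shutdown events into daily counts per policy reason and flags
a sudden increase. All names and numbers are illustrative.
"""
from collections import Counter
from datetime import date


def shutdowns_per_day(events: list[tuple[date, str]]) -> dict[date, Counter]:
    """Group (day, policy_reason) shutdown events into per-day counts."""
    daily: dict[date, Counter] = {}
    for day, reason in events:
        daily.setdefault(day, Counter())[reason] += 1
    return daily


def is_spike(daily: dict[date, Counter], day: date, factor: float = 2.0) -> bool:
    """Flag a day whose total shutdowns exceed `factor` x the prior-day average."""
    prior = [sum(c.values()) for d, c in daily.items() if d < day]
    if not prior:
        return False
    return sum(daily[day].values()) > factor * (sum(prior) / len(prior))


if __name__ == "__main__":
    events = [
        (date(2025, 3, 1), "harassment and discrimination"),
        (date(2025, 3, 2), "romantic and sexual content"),
        (date(2025, 3, 3), "harassment and discrimination"),
        (date(2025, 3, 3), "harassment and discrimination"),
        (date(2025, 3, 3), "romantic and sexual content"),
    ]
    daily = shutdowns_per_day(events)
    print(daily[date(2025, 3, 3)])            # counts per policy reason
    print(is_spike(daily, date(2025, 3, 3)))  # True: 3 vs. prior average of 1
```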
Training Digital Moderators
Roblox, Keywords Studios, and Riot Games are partnering with research psychologist and Games for Change Research Director Rachel Kowert on a new certification program for digital community leaders. Roblox will contribute expertise on community moderation and prosocial design to pilot and help shape the curriculum for the new DLC Leadership Program. The goal of the initiative is to address the lack of standardized training for online moderators, community managers, and creators in the gaming space. In a recent article, Kowert said the program aims to “translate research on gaming communities and online behavior into practical tools that digital leaders can use to build more resilient and sustainable online communities.”
By participating in this effort, we’ll help lead the industry in developing a first-of-its-kind standardized certification program designed to benefit the Roblox creator community and the broader gaming industry. Once complete, the program will teach creators the critical skills required to effectively moderate and manage their own growing communities. We believe that training more moderators and creators on the best practices for healthy, respectful online communities will help make online gaming more positive for everyone.