The Facebook insider building content moderation for the AI era

When Brett Levenson left Apple for Facebook in 2019, he thought he could fix content moderation with better tech. He quickly learned the problem was deeper. Human reviewers had only 30 seconds per piece of content to apply poorly translated policies, and their decisions were only slightly better than a coin flip. This reactive model fails against today's adversarial actors and AI chatbots, which have caused high-profile incidents such as chatbots providing self-harm guidance to users.

This frustration led Levenson to found Moonbounce, which just raised $12 million. The company turns policy documents into executable code. Its system evaluates content in under 300 milliseconds and can block content outright or slow its distribution. It serves user-generated content platforms, AI companion companies, and image generators, handling over 40 million daily reviews for clients like Civitai and Channel AI.
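To make the idea concrete, here is a minimal sketch of what "policy documents as executable code" with a runtime decision budget could look like. This is purely illustrative: the rule format, function names, and regex-based matching are assumptions, not Moonbounce's actual implementation, which is not described in detail.

```python
import re
import time
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    SLOW = "slow"    # throttle distribution rather than block outright
    BLOCK = "block"

@dataclass
class Rule:
    source_line: str       # the policy text this rule was compiled from
    pattern: re.Pattern
    action: Action

def compile_policy(policy_lines):
    """Compile plain-text policy lines of the form 'block: <regex>' or
    'slow: <regex>' into executable Rule objects (hypothetical format)."""
    rules = []
    for line in policy_lines:
        action_name, _, pattern = line.partition(":")
        rules.append(Rule(line,
                          re.compile(pattern.strip(), re.IGNORECASE),
                          Action[action_name.strip().upper()]))
    return rules

def evaluate(content, rules):
    """Return the first matching rule's action (else ALLOW) plus the
    evaluation latency in milliseconds."""
    start = time.perf_counter()
    decision = Action.ALLOW
    for rule in rules:
        if rule.pattern.search(content):
            decision = rule.action
            break
    latency_ms = (time.perf_counter() - start) * 1000
    return decision, latency_ms

rules = compile_policy([
    "block: (?:buy|sell)\\s+illegal",
    "slow: unverified\\s+claim",
])
decision, ms = evaluate("where to buy illegal goods", rules)
print(decision, f"{ms:.2f}ms")  # → Action.BLOCK, latency far under a 300 ms budget
```

A real system would swap the toy regexes for learned classifiers, but the shape — policy compiled ahead of time, cheap evaluation at serving time — is what makes a sub-300-millisecond decision plausible.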

AI companies face growing legal pressure as their internal safety efforts fall short. Moonbounce positions itself as a neutral third-party enforcer at runtime, unencumbered by accumulated chat history. The team is developing "iterative steering" to redirect harmful conversations toward supportive responses rather than issuing blunt refusals. While acquisition by a giant like Meta might make sense, Levenson fears that would restrict the technology's broader benefit.
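The article gives no implementation detail on "iterative steering," but the concept — substitute a supportive redirect for a hard refusal when a conversation turns risky — can be sketched as follows. Every name here (the keyword-based `assess`, the `steer` wrapper, the canned supportive text) is a hypothetical stand-in, not Moonbounce's method.

```python
from enum import Enum

class Risk(Enum):
    SAFE = 0
    ELEVATED = 1

def assess(message: str) -> Risk:
    # Toy risk check for illustration; a production system would use a
    # trained classifier rather than keyword matching.
    return Risk.ELEVATED if "hurt myself" in message.lower() else Risk.SAFE

def steer(message: str, draft_reply: str) -> str:
    """If the user's message is elevated-risk, replace the model's draft
    reply with a supportive redirect instead of a blunt refusal."""
    if assess(message) is Risk.ELEVATED:
        return ("It sounds like you're going through something difficult. "
                "You're not alone, and support is available.")
    return draft_reply
```

The "iterative" part would presumably apply this kind of substitution turn by turn, nudging the whole conversation rather than cutting it off once.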