AI video-generation startup Luma launched Luma Agents on Thursday, a new platform designed to handle end-to-end creative work across text, image, video, and audio. The agents are powered by Luma’s Unified Intelligence family of models, which are built on a single multimodal reasoning system.
Luma is pitching the agents as a transformative tool for ad agencies, marketing teams, design studios, and enterprises: they can plan and generate content across multiple formats while coordinating with other AI models, including Luma’s own Ray 3.14, Google’s Veo 3 and Nano Banana Pro, ByteDance’s Seedream, and ElevenLabs’ voice models.
The foundation of Luma Agents is the startup’s Uni-1 model, the first in its Unified Intelligence family. According to Amit Jain, CEO and co-founder of Luma, Uni-1 has been trained on audio, video, image, language, and spatial reasoning. Jain describes the model as being able to “think in language and imagine and render in pixels or images,” a concept he calls “intelligence in pixels.” Capabilities for audio and video output will follow in later model releases.
Luma has begun rolling out the agentic platform with existing customers, including global ad agencies Publicis Groupe and Serviceplan, as well as brands like Adidas, Mazda, and the Saudi AI company Humain. Jain says these customers are not just buying a tool but rethinking how their business is done.
Jain positions Luma Agents as a game changer due to their ability to maintain persistent context across assets, collaborators, and creative iterations. The agents can evaluate and refine their own outputs, improving results through an iterative self-critique process. Jain compares this check-your-work capability to what has made coding agents so effective, noting the necessity of a loop that evaluates, fixes, and repeats until the solution is satisfactory.
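The evaluate-fix-repeat loop Jain describes can be sketched generically. The `generate`, `evaluate`, and `refine` functions below are hypothetical stand-ins for illustration only, not Luma’s actual API; the loop structure is the point.

```python
def self_critique_loop(brief, generate, evaluate, refine,
                       max_iters=5, threshold=0.9):
    """Produce a first draft, then iteratively critique and refine it
    until it scores above `threshold` or `max_iters` passes are used."""
    draft = generate(brief)
    for _ in range(max_iters):
        score, critique = evaluate(draft)
        if score >= threshold:
            break  # good enough: stop iterating
        draft = refine(draft, critique)
    return draft

# Toy stand-ins (hypothetical): the "asset" is a number nudged toward a target.
target = 10

def generate(brief):
    return 0  # naive first draft

def evaluate(draft):
    gap = abs(target - draft)
    return 1.0 - gap / target, f"off by {gap}"

def refine(draft, critique):
    return draft + 2  # each pass closes part of the gap

result = self_critique_loop("make it 10", generate, evaluate, refine)
```

Coding agents follow the same shape: run the tests, read the failures, patch, and repeat until the checks pass.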
He criticizes the current AI workflow in creative environments, which often requires learning to prompt numerous individual models. Instead, with Luma Agents, users do not need to prompt back and forth for each iteration. The system generates large sets of variations and allows users to steer the direction through conversation. Jain explains that because the Unified Intelligence models understand as well as generate, they can accomplish this end-to-end work.
He uses the analogy of a human architect designing a building. As the architect draws, they create an internal mental model of the structure, light, and spatial dynamics. Jain says Unified Intelligence is built on the same principle.
The system aims to significantly speed up creative workflows. In a demonstration, Jain showed how a 200-word brief and an image of a product, like a tube of lipstick, could lead the system to generate various ideas for an ad campaign’s locations, models, and color schemes.
In one practical example, Jain said Luma Agents adapted a brand’s $15 million, year-long ad campaign into localized ads for different countries. The work was completed in 40 hours for under $20,000 and passed the brand’s internal quality and accuracy checks.
Luma Agents is now publicly available via API. Jain said the startup plans to roll out access gradually to ensure reliable service and avoid disruptions for users.

