Michael Gerstenhaber, a product VP at Google Cloud, focuses primarily on Vertex AI, the company’s unified platform for deploying enterprise AI. The role gives him a high-level view of how companies are actually using AI models and what remains to be done to unlock the full potential of agentic AI.
In a conversation with Michael, one idea stood out as particularly novel. He explained that AI models are advancing along three distinct frontiers at once. The first is raw intelligence. The second is response time. The third is a quality less about capability and more about economics: whether a model can be deployed cheaply enough to operate at massive, unpredictable scale. This framework offers a valuable new way to think about model capabilities, especially for those aiming to push frontier models in new directions.
When asked about his background and role, Michael described his journey. He has been in AI for about two years, spending a year and a half at Anthropic before joining Google almost half a year ago. At Google, he runs Vertex, the developer platform. His team serves engineers who are building their own applications, providing access to agentic patterns, an agentic platform, and inference from the world’s smartest models. He emphasized that his role is to supply the platform, while customers like Shopify and Thomson Reuters build the applications for their specific domains.
Michael was drawn to Google because of its unique vertical integration. The company operates across the entire stack, from building data centers and power plants to designing its own chips and models. It controls the inference layer, the agentic layer, and offers APIs for functions like memory and interleaved code writing. It also has agent engines for compliance and governance, and consumer-facing interfaces like Gemini. This comprehensive control, from infrastructure to application, represented a key strength in his view.
Asked about the competitive landscape, in which the major AI labs appear closely matched in capabilities, Michael argued that the picture is more complex than a simple race for intelligence. He identifies three critical frontiers. Models like Gemini Pro are tuned for raw intelligence, ideal for tasks like code generation where quality is paramount and latency is less critical.
A second boundary is latency. For use cases like customer support, where an agent needs to apply a policy in real-time, intelligence must be delivered within a strict latency budget. The perfect answer is useless if it arrives after the customer has hung up.
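The latency-budget idea can be made concrete with a small sketch: try the smarter but slower model first, and fall back to a faster one if the budget is exceeded. The model names, delays, and budget below are illustrative stand-ins, not real Vertex APIs.

```python
# Sketch: enforcing a latency budget on an inference call, with a
# fallback to a faster model. All names and timings are hypothetical.
import asyncio

async def call_model(name: str, delay: float) -> str:
    # Stand-in for a real inference call; `delay` simulates model latency.
    await asyncio.sleep(delay)
    return f"answer from {name}"

async def answer_within_budget(budget_s: float) -> str:
    try:
        # Prefer the smarter (slower) model, but cap how long we wait.
        return await asyncio.wait_for(
            call_model("smart-model", delay=0.5), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Budget blown: a late perfect answer is useless, so fall back.
        return await call_model("fast-model", delay=0.05)

print(asyncio.run(answer_within_budget(budget_s=0.1)))
```

With a tight 100 ms budget the call falls back to the fast model; with a generous budget the smart model's answer arrives in time.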
The third boundary is cost at scale. Companies like Reddit or Meta, which need to moderate vast, unpredictable volumes of content, require models that are not only intelligent but also affordable enough to deploy at essentially infinite scale. For them, cost becomes the paramount factor in model selection.
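A rough back-of-the-envelope calculation shows why cost becomes the deciding factor at that scale. All volumes and per-token prices below are illustrative assumptions, not real figures for any model or company.

```python
# Sketch: daily inference cost at moderation-style scale.
# Volumes and prices are illustrative assumptions only.
def daily_cost(items_per_day: int, tokens_per_item: int,
               usd_per_million_tokens: float) -> float:
    total_tokens = items_per_day * tokens_per_item
    return total_tokens / 1_000_000 * usd_per_million_tokens

# Moderating 100M items/day at ~500 tokens each:
frontier = daily_cost(100_000_000, 500, 5.00)  # premium-tier pricing
small = daily_cost(100_000_000, 500, 0.10)     # cost-optimized tier

print(f"frontier: ${frontier:,.0f}/day, small: ${small:,.0f}/day")
```

Under these assumed prices the premium tier runs $250,000 per day against $5,000 for the cost-optimized tier, a 50x gap that dwarfs any marginal quality difference for high-volume classification work.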
Addressing the slower-than-expected adoption of agentic AI systems, Michael pointed to the technology’s nascent state. While the models themselves are impressive, the necessary supporting infrastructure is still developing. Patterns for auditing agent actions, authorizing data access for agents, and other production requirements are still being established. Production readiness always lags behind technological capability.
He noted that adoption has been uniquely rapid in software engineering because it fits neatly into existing development lifecycles. Safe dev environments, promotion pipelines, and human-in-the-loop processes like code reviews make implementation low-risk. The challenge is to develop and institutionalize similar safe patterns for other professions and use cases.
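The human-in-the-loop pattern he points to can be sketched as a simple approval gate, mirroring code review: the agent proposes an action, and nothing executes until a human signs off. The class and function names here are hypothetical, not part of any real agent framework.

```python
# Sketch of a human-in-the-loop gate for agent actions.
# All names are hypothetical illustrations of the pattern.
from dataclasses import dataclass

@dataclass
class ProposedAction:
    description: str
    approved: bool = False

def review(action: ProposedAction, approve: bool) -> ProposedAction:
    # A human reviewer decides; the agent cannot self-approve.
    action.approved = approve
    return action

def execute(action: ProposedAction) -> str:
    if not action.approved:
        raise PermissionError("action requires human approval before execution")
    return f"executed: {action.description}"

patch = review(ProposedAction("apply config change"), approve=True)
print(execute(patch))
```

The point of the design is that execution is structurally impossible without review, just as a promotion pipeline blocks unreviewed code from reaching production.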

