Nvidia unveils new GPU designed for long-context inference

At the AI Infrastructure Summit on Tuesday, Nvidia announced a new GPU called the Rubin CPX. Designed for context windows larger than one million tokens, the chip is part of the company’s forthcoming Rubin series. The CPX is optimized for processing long sequences of context and is meant to be deployed as part of a broader disaggregated inference infrastructure.

For users, this advancement will result in better performance on long-context tasks such as video generation or software development. Nvidia’s relentless development cycle has driven enormous profits for the company, which reported $41.1 billion in data center sales in its most recent quarter. The Rubin CPX is slated to become available at the end of 2026.