Nvidia unveiled a new addition to its upcoming Rubin series, the Rubin CPX GPU, at the AI Infrastructure Summit on Tuesday.
The processor is designed to handle context windows larger than one million tokens.
Long-Context Inference
The Rubin CPX is tailored for workloads that demand extended context. Most current AI models can attend to only a limited window of tokens at a time, so longer inputs must be truncated or split.
By contrast, the CPX lets models process far larger sequences in a single pass. This capability is particularly important for tasks like video generation, advanced coding, and long-form content creation.
In short, the CPX helps AI systems deliver smoother results without losing track of context.
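The difference comes down to token budgets. Below is a minimal sketch of the constraint; the one-million-token figure comes from Nvidia's announcement, while the 128K limit is an illustrative stand-in for a typical current model, and count_tokens is a hypothetical placeholder for a real tokenizer.

```python
# Minimal sketch: does a corpus fit in one pass under a given context limit?
# The 128K limit and count_tokens heuristic are illustrative assumptions.

LONG_CONTEXT_LIMIT = 1_000_000   # token scale the Rubin CPX targets
TYPICAL_LIMIT = 128_000          # assumed limit for a typical current model

def count_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def fits_in_context(documents: list[str], limit: int) -> bool:
    """Return True if all documents fit in a single pass under `limit` tokens."""
    return sum(count_tokens(doc) for doc in documents) <= limit

# e.g. feeding an entire (hypothetical) code repository as one prompt
repo_files = ["def main(): ..."] * 50_000

print(fits_in_context(repo_files, TYPICAL_LIMIT))       # False: must be chunked
print(fits_in_context(repo_files, LONG_CONTEXT_LIMIT))  # True: one pass
```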
Disaggregated Inference
Nvidia also highlighted the chip’s place in its “disaggregated inference” approach, in which the phases of an AI workload are divided across specialized hardware rather than concentrated in one system.
The CPX handles the compute-heavy context phase, where the model ingests the full prompt, while other Rubin-class GPUs handle the bandwidth-bound generation phase that produces output token by token.
This division of labor improves scalability. It also reduces bottlenecks when deploying large AI applications.
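As a rough illustration of that handoff, here is a conceptual sketch in Python. The two-pool split mirrors the prefill/decode division described above, but the class and method names (DevicePool, prefill, decode) are hypothetical, not Nvidia's software interface.

```python
# Conceptual sketch of disaggregated inference, assuming a simple two-pool
# setup: compute-heavy prefill (context processing) runs on one device pool,
# token-by-token decode on another. All names here are illustrative.

from dataclasses import dataclass

@dataclass
class DevicePool:
    name: str

    def prefill(self, prompt_tokens: list[int]) -> dict:
        # Process the full prompt in one pass and return a stand-in for
        # the KV cache handed off to the decode stage.
        return {"kv_cache_size": len(prompt_tokens), "pool": self.name}

    def decode(self, kv_cache: dict, max_new_tokens: int) -> list[int]:
        # Generate output tokens one at a time against the handed-off cache.
        return [0] * max_new_tokens  # placeholder output

context_pool = DevicePool("context-stage")    # long-context prefill hardware
generate_pool = DevicePool("generate-stage")  # bandwidth-bound decode hardware

def run_inference(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    # 1. Prefill the long prompt on the context-optimized pool.
    cache = context_pool.prefill(prompt_tokens)
    # 2. Hand the cache off and decode on the generation pool.
    return generate_pool.decode(cache, max_new_tokens)

output = run_inference(prompt_tokens=list(range(1_000_000)), max_new_tokens=256)
```

The point of the split is that each pool can be provisioned for its own bottleneck, compute for prefill and memory bandwidth for decode, instead of sizing one system for both.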
Finance
Nvidia’s aggressive product cycle has translated into record earnings. The company reported $41.1 billion in data center revenue in its most recent quarter.
The Rubin CPX, expected to launch at the end of 2026, signals Nvidia’s intent to maintain that lead.