Nvidia Confirms Vera Rubin Roadmap at GTC

Vera CPU + Rubin GPUs + NVLink rack fabrics, NVL form factors go production‑ready

An isometric illustration depicts stacked server equipment interconnected by glowing data streams beneath a panel featuring a cloud symbol. © The GPU Trade Inc 2026

By Mark J. Harvilla | March 17, 2026

At GTC this year, Nvidia doubled down on Vera Rubin as its next‑generation, rack‑scale AI platform and showed production‑ready NVL form factors for customer deployment. The company framed the announcement as proof that its rack‑scale strategy is moving from lab demos to commercial systems.

Vera Rubin is not a single chip but a co‑designed system: Nvidia pairs a Vera CPU with Rubin GPUs inside a shared NVLink domain so the whole rack operates as one accelerator. Nvidia’s technical blog and briefings emphasize that the platform was engineered as a tightly coupled CPU–GPU–interconnect stack.

Nvidia has presented the Vera Rubin NVL72 rack as the flagship form factor and said Rubin‑series silicon is now in production at its foundry partners, with partner hardware expected to ship in the second half of 2026. The company also outlined larger NVL configurations for more extreme scale.

A central piece of the platform is NVLink 6, the rack‑scale fabric Nvidia described as an ultra‑low‑latency, all‑to‑all switch that can deliver 3.6 TB/s of bandwidth per GPU and hundreds of terabytes per second in aggregate across a rack. Nvidia presented this as the mechanism that lets dozens of GPUs behave like a single, tightly coupled accelerator.
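
The aggregate figure follows from simple multiplication, which the short Python sketch below works through. It assumes that the "72" in NVL72 counts the GPUs sharing one NVLink domain, something the coverage here does not state explicitly. The 3.6 TB/s per‑GPU number is the one Nvidia quoted. The function name is illustrative only.

```python
# Back-of-the-envelope aggregate NVLink bandwidth for one rack.
# Assumption (not stated in the article): an NVL72 rack holds 72 GPUs
# in a single NVLink domain.
PER_GPU_TBPS = 3.6   # per-GPU NVLink 6 bandwidth quoted by Nvidia, in TB/s
GPUS_PER_RACK = 72   # assumed from the "NVL72" name


def aggregate_bandwidth_tbps(gpus: int, per_gpu_tbps: float) -> float:
    """Return the sum of per-GPU link bandwidth across a rack, in TB/s."""
    if gpus < 0 or per_gpu_tbps < 0:
        raise ValueError("inputs must be non-negative")
    return gpus * per_gpu_tbps


rack_total = aggregate_bandwidth_tbps(GPUS_PER_RACK, PER_GPU_TBPS)
print(f"{rack_total:.1f} TB/s aggregate per rack")  # prints "259.2 TB/s aggregate per rack"
```

Under that assumption the total comes to roughly 259 TB/s, which is consistent with the "hundreds of terabytes" aggregate Nvidia described. It is a sum of link bandwidth, not a measured throughput for any workload.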

Nvidia also emphasized system‑level features beyond raw FLOPS. The roadmap adds a rack‑scale trusted execution environment—what the company markets as confidential computing across CPU, GPU and NVLink—and a new RAS stack designed for continuous operation and zero‑downtime testing. Those features are aimed at keeping large models and data secure while running at production scale.

Nvidia pitched Vera Rubin as the fulcrum for what it calls agentic AI and “AI factory” deployments — environments that string many rack‑scale accelerators together for continuous training, reasoning, and multi‑phase inference. The company named cloud and hyperscale partners that plan early Rubin rollouts.

The vendor ecosystem on display at GTC illustrates how Vera Rubin is intended to reshape server supply chains. Cisco, Dell, HPE, Lenovo and Supermicro were listed as the OEM partners expected to deliver Rubin‑based servers, and several cloud hosts were named as initial deployers. That list signals broad OEM engagement but also a tight dependency on a single platform design.

Nvidia’s corporate materials and independent writeups both say the Rubin GPU and Vera CPU designs taped out earlier and entered fabrication, with sampling and production ramps beginning in late 2025 and early 2026. Nvidia’s public timeline puts broad availability in H2 2026, though partners will stagger their offerings.

Beyond NVL72, Nvidia has previewed very large racks such as NVL576 and liquid‑cooled Kyber racks that push rack power into the hundreds of kilowatts. Those designs underline a trend toward megawatt‑class AI pods and will force data‑center operators to rethink cooling, floor space and power distribution.
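
To get a feel for what that means for power planning, here is a rough Python sketch of how many racks fit under a fixed facility power budget. Every number in it is a hypothetical placeholder, not an Nvidia specification: the 1 MW budget, the 200 kW rack and the PUE of 1.25 are chosen only to make the arithmetic easy to follow, and the function name is illustrative.

```python
import math


def racks_within_budget(facility_kw: float, rack_kw: float, pue: float = 1.0) -> int:
    """Count racks whose IT load fits inside a facility power budget.

    PUE (power usage effectiveness) is total facility power divided by
    IT power, so the power left for IT equipment is facility_kw / pue.
    """
    if facility_kw < 0 or rack_kw <= 0 or pue < 1.0:
        raise ValueError("need facility_kw >= 0, rack_kw > 0 and pue >= 1.0")
    it_power_kw = facility_kw / pue
    return math.floor(it_power_kw / rack_kw)


# Hypothetical inputs, for illustration only.
print(racks_within_budget(facility_kw=1000, rack_kw=200, pue=1.25))  # prints 4
```

With these placeholder values, a megawatt of facility power supports only four racks, which is why rack power in the hundreds of kilowatts turns power distribution and cooling into first‑order design constraints.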

Nvidia framed the Vera–Rubin effort as “extreme co‑design”: silicon, interconnect, networking, DPUs and software were developed together to hit sustained utilization and predictable latency rather than peak benchmarks alone. In practice that means new switch silicon, an updated DPU/SMX stack, and software for large‑model execution across the rack.

That breadth of co‑design also creates supply‑chain pressure points. Industry reporting flags risks around advanced packaging capacity, HBM4 memory ramps, and foundry schedules that could shape how fast OEMs and cloud providers can ship Rubin systems at scale. Customers and partners will watch availability windows closely.

For enterprises and cloud buyers, the platform promises consolidation: fewer, denser rack designs that can run training and long‑context reasoning workloads with lower overhead. But it also concentrates buying power and technical dependence, which could shift negotiating leverage toward platform owners and primary OEMs.

What to watch next: proof‑in‑production through partner rollouts in H2 2026, early performance and utilization data from cloud instances, and whether competing chip makers or open‑architecture server vendors can match the same level of co‑design. Nvidia’s GTC messaging made clear that Rubin systems are meant for continuous, production‑grade AI, and the company says the ecosystem is ready to follow.