Glean

Glean Adds NVIDIA Nemotron 3 Ultra Support

Enterprise platform adds a lower‑cost open model for agentic workloads

Glean announced on June 4, 2026 that its enterprise Work AI platform will support NVIDIA’s Nemotron 3 Ultra, giving customers a new open‑model option for agentic tasks inside Glean’s orchestration layer.

NVIDIA positions Nemotron 3 Ultra as a high‑efficiency member of its open Nemotron 3 model family, tuned for agentic applications and designed to deliver strong reasoning and multimodal capabilities with a focus on throughput and cost efficiency.

NVIDIA has promoted the Nemotron 3 family as broadly available to developers and enterprises through multiple distribution channels. Nemotron 3 Ultra is expected to be accessible at launch via public model hubs and partner inference services.

Glean frames the integration as a practical way for enterprises to balance cost and capability across workflows. Its platform combines connectors, an enterprise graph and an agentic engine so teams can route tasks to different models based on context, cost and required capability.

For companies running agentic workloads (autonomous workflows that plan, act and follow up inside corporate systems), Nemotron 3 Ultra offers an open alternative to expensive closed frontier models, Glean said in its announcement. The vendor pitches this as a way to lower inference bills while keeping stronger models on critical paths.

Practically, the move means organizations using Glean can configure agents to call Nemotron 3 Ultra for routine reasoning, escalate to higher‑capability models for complex decisions, or blend outputs across models inside a single context stream. That multi‑model orchestration is central to Glean’s Work AI strategy.
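To make the routing idea concrete, here is a minimal sketch of cost‑aware model selection with escalation. It is not Glean's API or configuration format, which the announcement does not describe. The model labels, the 1 to 10 capability scores, the prices and the `route` function are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                   # hypothetical label, not a real endpoint
    capability: int             # assumed 1-10 score assigned by the operator
    cost_per_1k_tokens: float   # placeholder price, not a published rate


# Illustrative catalogue only; every number here is an assumption.
MODELS = [
    ModelOption("open-efficient-model", capability=7, cost_per_1k_tokens=0.5),
    ModelOption("closed-frontier-model", capability=10, cost_per_1k_tokens=5.0),
]


def route(required_capability: int, critical: bool = False) -> ModelOption:
    """Return the cheapest model that meets the requirement.

    Tasks flagged as critical, and tasks no model fully satisfies,
    escalate to the most capable model in the catalogue.
    """
    strongest = max(MODELS, key=lambda m: m.capability)
    if critical:
        return strongest
    eligible = [m for m in MODELS if m.capability >= required_capability]
    if not eligible:
        return strongest
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

Under these assumptions, a routine task with `required_capability=5` goes to the cheaper open model, while a task needing a score of 9, or any task marked `critical=True`, escalates to the stronger one. A production router would presumably also weigh context and policy, which this sketch leaves out.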

Partner support and inference ecosystems are already lining up to serve Nemotron 3 Ultra at launch. Third‑party inference platforms and vendors announced day‑one or near‑day‑one compatibility to help enterprises run the model in production.

Glean emphasized secure enterprise controls: models operate within its connectors and enterprise memory, which keep data, prompts and retrieval grounded in corporate context and policy frameworks. The design aims to reduce data leakage and enforce access controls when agents act on sensitive internal systems.

There are limits and tradeoffs to watch. Open, efficient models like Nemotron 3 Ultra trade some top‑end benchmark headroom for throughput and cost. Enterprises will need to validate accuracy, safety and latency for each use case and build guardrails for agentic behavior. NVIDIA’s technical notes and early vendor tests underline both the gains and the areas requiring careful evaluation.

The announcement also sits inside a broader industry push toward open agentic stacks. NVIDIA’s Nemotron coalition and related partnerships are promoting open models and inference tooling at scale, and enterprise software vendors are racing to offer orchestration layers that let customers mix and match models.

For enterprise buyers, the Glean–Nemotron 3 Ultra pairing signals a maturing market where platform vendors provide not just a single model but an operational control plane for cost, context and compliance. Enterprises that pilot the integration should measure task‑level cost, accuracy and safety to decide where open models like Nemotron 3 Ultra fit in their agentic stacks.
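As one possible way to structure such a pilot, the sketch below aggregates per‑model cost, accuracy and safety figures from logged task runs. The `TaskRun` record, its fields and the `summarize` helper are assumptions made for illustration. They do not reflect any Glean or NVIDIA tooling, and how a run is judged correct or safe is left to the evaluating team.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRun:
    model: str              # hypothetical model label
    cost_usd: float         # measured inference cost for this run
    correct: bool           # outcome of the team's own accuracy check
    safety_violation: bool  # outcome of the team's own safety review


def summarize(runs):
    """Group runs by model and compute average cost, accuracy and violation rate."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run.model].append(run)

    summary = {}
    for model, model_runs in grouped.items():
        n = len(model_runs)
        summary[model] = {
            "runs": n,
            "avg_cost_usd": sum(r.cost_usd for r in model_runs) / n,
            "accuracy": sum(r.correct for r in model_runs) / n,
            "violation_rate": sum(r.safety_violation for r in model_runs) / n,
        }
    return summary
```

Comparing these per‑model summaries task type by task type gives a simple basis for deciding which workloads can move to a cheaper open model and which should stay on a stronger one.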