DataCenterNews US - Specialist news for cloud & data center decision-makers

Oracle unveils OCI Zettascale10 AI supercomputer with 16 zettaFLOPS

Wed, 15th Oct 2025

Oracle has introduced Oracle Cloud Infrastructure (OCI) Zettascale10, a cloud-based AI supercomputer that connects hundreds of thousands of NVIDIA GPUs to deliver up to 16 zettaFLOPS of peak performance.

The new system forms the computing backbone for large-scale artificial intelligence workloads and is being utilised as the central element in the Stargate supercluster located in Abilene, Texas, developed in collaboration with OpenAI.

Performance and architecture

OCI Zettascale10 represents a further step in Oracle's AI infrastructure portfolio, expanding on the Zettascale cluster first released in September 2024. According to the company, Zettascale10 clusters are housed in gigawatt-scale data centre campuses built for density within a two-kilometre radius, a layout intended to minimise GPU-to-GPU latency for demanding AI training tasks.

The system is built upon Oracle Acceleron RoCE networking, which works in tandem with NVIDIA's full-stack AI hardware to create large clusters capable of processing power-hungry AI workloads. The infrastructure aims to provide what Oracle describes as industry-leading price-performance, improved reliability, and high utilisation for enterprise customers seeking to train and deploy their biggest AI models.

"With OCI Zettascale10, we're fusing OCI's groundbreaking Oracle Acceleron RoCE network architecture with next-generation NVIDIA AI infrastructure to deliver multi-gigawatt AI capacity at unmatched scale," said Mahesh Thiagarajan, Executive Vice President, Oracle Cloud Infrastructure. "Customers can build, train, and deploy their largest AI models into production using less power per unit of performance and achieving high reliability. In addition, customers will have the freedom to operate across Oracle's distributed cloud with strong data and AI sovereignty controls."

Supercomputing for large AI models

The flagship deployment of the Zettascale10 cluster is under way at the Stargate supercluster. The system's custom RoCE (Remote Direct Memory Access over Converged Ethernet) fabric aims to minimise GPU-to-GPU communication latency, which Oracle and its partners cite as essential for scaling up AI research and enterprise deployment.

"OCI Zettascale10 network and cluster fabric was developed and deployed first at the flagship Stargate site in Abilene, Texas - our joint supercluster with Oracle," said Peter Hoeschele, Vice President, Infrastructure and Industrial Compute, OpenAI. "The highly scalable custom RoCE design maximizes fabric-wide performance at gigawatt scale while keeping most of the power focused on compute. We're excited to keep scaling Abilene and the broader Stargate program together."

Initially, Oracle plans to target deployments of up to 800,000 NVIDIA GPUs for customer clusters, which the company says will deliver predictable performance and strong cost efficiency. Connectivity is facilitated by Oracle Acceleron's low-latency RoCEv2 networking.

Industry collaboration

"Oracle and NVIDIA are bringing together OCI's distributed cloud and our full-stack AI infrastructure to deliver AI at extraordinary scale," said Ian Buck, Vice President of Hyperscale, NVIDIA. "Featuring NVIDIA full-stack AI infrastructure, OCI Zettascale10 provides the compute fabric needed to advance state-of-the-art AI research and help organizations everywhere move from experimentation to industrialized AI."

This partnership leverages Oracle Acceleron's approach to networking. The technology uses switching capabilities built into modern GPU network interface cards (NICs), which can connect to multiple switches on separate network planes. This design allows for enhanced scalability and reliability, as traffic can be redirected across planes in the event of a problem without requiring full system resets.
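The plane-level rerouting described above can be sketched in miniature. The class and plane names below are purely illustrative assumptions, not Oracle's implementation: the point is that when each NIC spans several independent network planes, a failed plane is simply dropped from the candidate set and flows shift to the survivors without a fabric-wide reset.

```python
# Hypothetical sketch of multi-plane NIC failover. All names are
# illustrative; Oracle's actual Acceleron design is not public code.

class MultiPlaneNic:
    def __init__(self, planes):
        # planes: identifiers of the independent network planes this NIC reaches
        self.planes = list(planes)
        self.healthy = set(planes)

    def mark_failed(self, plane):
        # Take one plane out of service; the others keep carrying traffic.
        self.healthy.discard(plane)

    def select_plane(self, flow_id):
        # Hash a flow onto one of the currently healthy planes.
        candidates = sorted(self.healthy)
        if not candidates:
            raise RuntimeError("no healthy network plane available")
        return candidates[flow_id % len(candidates)]


nic = MultiPlaneNic(["plane-a", "plane-b", "plane-c"])
before = nic.select_plane(7)
nic.mark_failed(before)        # simulate a plane failure
after = nic.select_plane(7)    # the same flow lands on a surviving plane
assert after != before
```

The design choice this illustrates is that failure handling stays local to the routing decision: no global state machine has to restart, which is what the article means by rerouting "without requiring full system resets".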

Networking features

According to Oracle, Oracle Acceleron RoCE networking offers features designed for efficiency and resilience in large-scale AI training. These include the deployment of wide, shallow fabrics that reduce overall networking tiers while increasing scalability, alongside improved reliability by isolating traffic and enabling plane-level rerouting to avoid bottlenecks or failures.

The architecture aims for consistent GPU-to-GPU latency by eliminating a network tier compared with traditional three-tier designs, which can help customers who require predictable performance for both AI training and inference workloads. The networking also supports high-bandwidth optics, specifically Linear Pluggable Optics (LPO) and Linear Receive Optics (LRO), targeting lower network and cooling costs without reducing throughput. This is designed to let more of the data centre's power budget go to compute rather than networking overhead.
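A back-of-envelope calculation shows why removing a tier matters for latency consistency. This is a generic folded-Clos observation, not Oracle's figures: the worst-case path climbs up through each tier once and back down, so every tier removed takes two switch hops off the longest path.

```python
# Generic Clos-fabric arithmetic; illustrative only, not Oracle's numbers.

def worst_case_switch_hops(tiers):
    # Worst-case path goes up through each tier and back down,
    # crossing 2*tiers - 1 switches in total.
    return 2 * tiers - 1

three_tier = worst_case_switch_hops(3)  # e.g. leaf -> spine -> super-spine -> spine -> leaf
two_tier = worst_case_switch_hops(2)    # e.g. leaf -> spine -> leaf
assert three_tier - two_tier == 2
```

Fewer hops also means fewer queueing points, which narrows the spread between best-case and worst-case latency — the "predictable performance" the article refers to.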
