
Inside SC25: AI Factories From Racks to Qubits
AI didn’t just show up at Supercomputing 2025 (SC25) in St. Louis—it took over the agenda. From exabyte-scale storage and 800 Gbps fabrics to liquid-cooled racks and emerging quantum accelerators, SC25 made it clear that the next era of HPC is really about building AI factories end to end.
Below is a structured look at the announcements the TechArena team is tracking, organized around the major layers of the stack.
1. Data and Memory Platforms for Agentic AI
The most urgent theme on the show floor: getting more useful work out of every GPU. That starts with memory and data.
WEKA: Breaking the GPU Memory Wall and Storage Economics
WEKA formally took its Augmented Memory Grid from concept to commercial availability on NeuralMesh, validated on Oracle Cloud Infrastructure (OCI) and other AI clouds. The goal is to extend GPU key-value cache capacity from gigabytes into the petabyte range by streaming KV cache between HBM and flash over RDMA using NVIDIA Magnum IO GPUDirect Storage.
The reported gains are significant: 1000x more KV cache capacity, up to 20x faster time-to-first-token at 128k tokens versus recomputing prefill, and multi-million IOPS performance at cluster scale. For long-context LLMs and agentic AI workflows, that means fewer evictions, less recompute, and better tenant density per GPU — directly attacking inference cost structures on OCI and other platforms.
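To make the mechanism concrete, here is a minimal Python sketch of KV-cache tiering in the spirit of what WEKA describes: hot entries stay in a small GPU-memory tier, colder ones are evicted to a much larger flash tier and streamed back on reuse instead of forcing a prefill recompute. The class and tier names are hypothetical stand-ins; the real system does this over RDMA with GPUDirect Storage at petabyte scale.

```python
from collections import OrderedDict

class TieredKVCache:
    """Illustrative two-tier KV cache: a small 'HBM' tier backed by a large 'flash' tier.

    In a real deployment the flash tier would be remote NVMe reached over
    RDMA/GPUDirect Storage; here both tiers are in-process dicts for clarity.
    """

    def __init__(self, hbm_capacity: int):
        self.hbm = OrderedDict()   # hot tier: limited capacity, LRU-ordered
        self.flash = {}            # cold tier: effectively unbounded
        self.hbm_capacity = hbm_capacity

    def put(self, seq_id, kv_blocks):
        self.hbm[seq_id] = kv_blocks
        self.hbm.move_to_end(seq_id)
        # Evict least-recently-used sequences to flash instead of dropping them.
        while len(self.hbm) > self.hbm_capacity:
            victim, blocks = self.hbm.popitem(last=False)
            self.flash[victim] = blocks

    def get(self, seq_id):
        if seq_id in self.hbm:            # HBM hit: cheapest path
            self.hbm.move_to_end(seq_id)
            return self.hbm[seq_id]
        if seq_id in self.flash:          # flash hit: stream back, skip prefill recompute
            self.put(seq_id, self.flash.pop(seq_id))
            return self.hbm[seq_id]
        return None                       # miss: caller must recompute prefill

cache = TieredKVCache(hbm_capacity=2)
cache.put("session-a", "kv-a")
cache.put("session-b", "kv-b")
cache.put("session-c", "kv-c")            # evicts session-a to flash, not to oblivion
assert cache.get("session-a") == "kv-a"   # served from flash, no prefill recompute
```

The point of the design is visible even in this toy: an eviction becomes a demotion rather than a loss, which is what turns long-context, multi-tenant inference from a recompute problem into a bandwidth problem.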
On the hardware side, WEKA’s next-gen WEKApod appliances push the economics further. WEKApod Prime uses “AlloyFlash” mixed-flash configurations to deliver 65% better price performance while preserving full-speed writes, and WEKApod Nitro focuses on performance density with 800 Gb/s networking via NVIDIA ConnectX-8 SuperNICs. Together, they target AI factories that need high GPU utilization, high density, and lower power per terabyte.
VAST Data + Microsoft Azure: AI OS Meets Cloud Scale
VAST Data is extending its AI Operating System into Microsoft Azure. VAST AI OS will run on Azure’s Laos VM Series with Azure Boost, giving customers a unified “DataSpace” global namespace so they can move between on-prem and Azure without refactoring data pipelines.
InsightEngine and AgentEngine let customers run vector search, RAG pipelines, and agent workflows directly where the data lives, and the underlying disaggregated, shared-everything (DASE) design allows independent scaling of compute and storage. The combined effect is a cloud-native AI operating system tuned for agentic AI pipelines, built to keep Azure’s GPU and CPU fleets saturated.
MinIO ExaPOD: Exabyte as a Design Point, Not an Edge Case
MinIO’s ExaPOD reference architecture plants a big flag for exascale AI data. It’s a 1 EiB usable building block (about 36 PiB usable per rack) that scales linearly in performance and capacity. In the reference design, ExaPOD delivers on the order of 19.2 TB/s aggregate throughput at 1 EiB with 122.88 TB drives, around 900 W of power per PiB including cooling, and modeled all-in economics in the $4.55–$4.60/TiB-month range at exabyte scale.
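For a sense of scale, the published figures can be combined into a quick back-of-envelope calculation; the rack count, per-rack throughput, total power, and monthly dollar figures below are derived from the numbers above, not vendor-stated results.

```python
# Back-of-envelope derivation from the published ExaPOD figures (illustrative only).
EIB_IN_PIB = 1024
EIB_IN_TIB = 1024 ** 2

racks = EIB_IN_PIB / 36                      # ~28.4 racks at 36 PiB usable per rack
per_rack_tb_s = 19.2 / racks                 # ~0.68 TB/s of aggregate throughput per rack
total_power_mw = 900 * EIB_IN_PIB / 1e6      # ~0.92 MW for 1 EiB, including cooling
monthly_cost_musd = (4.55 * EIB_IN_TIB / 1e6,
                     4.60 * EIB_IN_TIB / 1e6)  # ~$4.77M-$4.82M per month at 1 EiB

print(f"racks ~{racks:.1f}, per rack ~{per_rack_tb_s:.2f} TB/s, "
      f"power ~{total_power_mw:.2f} MW, cost ~${monthly_cost_musd[0]:.2f}M-${monthly_cost_musd[1]:.2f}M/month")
```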
Built on Supermicro servers, Intel Xeon 6781P, and Solidigm D5-P5336 NVMe, ExaPOD is clearly aimed at hyperscalers, neoclouds, and large enterprises that see exabytes as the new baseline for LLMOps, simulations, and observability.
2. Power, Cooling, and the Introduction of PCE
As AI deployments creep toward gigawatt footprints, power and cooling have shifted from “facility detail” to board-level design constraint.
Airsys PowerOne and Aegis: Cooling as a Compute Multiplier
Airsys introduced PowerOne, a modular, multi-medium cooling architecture that scales from 1 MW edge sites to 100+ MW hyperscale data centers. It’s tailored for AI and HPC density with a standard cooling stack (CritiCool-X chiller, FluidCool-X CDU, MaxAir fan wall, Optima2C CRAH) and a LiquidRack spray-cooling architecture that can operate in compressor-less modes with dry coolers where climate allows.
Beyond traditional PUE, Airsys is pushing Power Compute Effectiveness (PCE)—a metric that measures how much provisioned power turns into usable compute. The message is that cooling should unlock stranded power and convert it into AI capacity, not just shave a few basis points off energy overhead.
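Airsys didn’t publish a formal equation in this announcement, so the definitions below are an assumed illustration of the PCE idea rather than the company’s official formula, but they show why the metric surfaces something PUE hides.

```python
# Illustrative contrast between PUE and a PCE-style metric.
# These definitions are assumptions for illustration, not Airsys' published formula.

def pue(total_facility_power_kw, it_power_kw):
    """Classic PUE: facility power consumed per unit of IT power."""
    return total_facility_power_kw / it_power_kw

def pce(useful_compute_power_kw, provisioned_power_kw):
    """PCE-style ratio: how much provisioned power lands as usable compute."""
    return useful_compute_power_kw / provisioned_power_kw

# A hypothetical 10 MW site: 8 MW reaches IT gear, but stranded capacity and
# thermal headroom limits leave only 6.5 MW doing useful compute work.
print(pue(10_000, 8_000))   # 1.25 -> looks respectable
print(pce(6_500, 10_000))   # 0.65 -> exposes the stranded power that PUE never counts
```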
In parallel, Aegis, an affiliated liquid-cooling arm, is being positioned as an agile R&D hub building two-phase CDUs, cold plates, and control systems using rapid 3D manufacturing to keep pace with AI thermal demands.
Schneider Electric and Motivair: Integrated Power + Liquid Cooling
Schneider Electric is leaning into its acquisition of Motivair, blending global power and infrastructure capabilities with more than 15 years of exascale and accelerated-computing cooling experience. The combined portfolio spans chip-level cold plates, rear-door heat exchangers, CDUs, and facility-level power and control systems.
The through-line is that liquid cooling is now being evaluated as part of a full-stack design conversation with power and infrastructure, especially for hyperscalers, colocation providers, and high-density AI factories where 100 kW-plus racks are quickly becoming normal.
Iceotope KUL BOX: Liquid-Cooled AI at the Noisy, Messy Edge
Iceotope’s KUL BOX brings the AI factory cooling story out of the core data center and into edge environments that were never designed for dense clusters. It’s a compact, liquid-cooled AI inferencing cluster built as a turn-key system: a 24U rack with six Iceotope KUL AI chassis, up to 24 NVIDIA GPUs, top-of-rack switching, and a fully integrated liquid-cooling loop.
The key twist is deployment model. KUL BOX captures almost all of the system’s heat using Iceotope’s precision immersion cooling and rejects it through a separate liquid-to-air outdoor cooler—meaning it can be installed in locations without existing facility water, dry chillers, or traditional white-space infrastructure.
Iceotope highlights several benefits for edge AI and HPC workloads: consistent GPU throughput and reliability from stable thermals; lower energy and cooling overheads; quiet, fanless operation; and a single-vendor solution that bundles rack assembly, fluids, pipework, logistics, on-site installation, and a three-year service plan. Target use cases include telcos and colocation providers, labs running sensitive compute-heavy tasks, and industrial edge deployments with unusual constraints or sustainability requirements.
3. Server and Computing Hardware
On the compute side, vendors largely converged on the same message: more FLOPS per rack, more memory per GPU, and more network bandwidth behind every accelerator.
Dell Technologies: AI Factory Building Blocks
Dell made its AMD Instinct-powered PowerEdge XE9785 and XE9785L servers generally available and introduced the new Intel-powered PowerEdge R770AP. All three are tuned for demanding AI and HPC workloads within the broader Dell AI Factory portfolio.
On the network side, Dell’s new PowerSwitch Z9964F-ON and Z9964FL-ON switches deliver 102.4 Tb/s of switching capacity, targeting dense AI fabrics. Dell also announced integration of ObjectScale and PowerScale storage systems with NVIDIA’s NIXL library, tightening the connection between storage services and GPU-centered inference stacks.
Supermicro, ASUS, Compal, EnGenius: Dense GPU Nodes and Liquid Cooling
Several OEMs showcased how fast they can pack accelerators into standard racks:
Supermicro highlighted Data Center Building Block Solutions featuring NVIDIA GB300 NVL72 systems with 72 Blackwell Ultra GPUs and liquid cooling up to 200 kW per rack. It also launched a 10U air-cooled AMD Instinct MI355X server that claims up to 4x compute and 35x inference performance versus its predecessor.
ASUS unveiled its XA AM3A-E13 server with eight AMD Instinct MI355X GPUs and dual AMD EPYC 9005 CPUs, offering 288 GB of HBM3E and up to 8 TB/s of memory bandwidth per GPU in a modular 10U chassis. The platform complements ASUS’ broader AI infrastructure portfolio, including NVIDIA GB300-based systems.
Compal brought high-density, liquid-cooled SG720-2A/OG720-2A servers supporting up to eight AMD Instinct MI325X GPUs with forward compatibility for MI355X, plus the SG223-2A-I immersion-cooled system that supports up to eight PCIe GPUs in a 2U chassis.
EnGenius, better known in networking, jumped into the server market with modular Intel Xeon 6-based systems. The flagship 4U EAS5210 can be configured with up to eight Intel Arc Pro B60 accelerators for LLMs and AI training workloads, built on an OCP DC-MHS architecture.
Intel Xeon 6: Keeping CPUs Relevant in AI HPC
Intel used SC25 to emphasize that CPUs still matter in HPC and AI workflows, particularly for simulation, pre/post-processing, and orchestration. The Xeon 6 line targets up to 2.1x faster performance on key HPC workloads like LAMMPS, OpenFOAM, and Ansys Fluent, riding on higher memory bandwidth and built-in AI acceleration.
4. Networking for Gigascale AI: Cornelis and Friends
If storage and cooling are about feeding GPUs and keeping them alive, networking is about making the entire AI factory behave like a single coherent system.
Cornelis CN6000 SuperNIC: 800 Gbps, Multi-Protocol, AI-first
Cornelis rolled out its CN6000 SuperNIC, an 800 Gbps adapter that brings its Omni-Path architecture into Ethernet for the first time. CN6000 combines ultra-low latency, up to 1.6 billion messages per second, and full 800 Gbps throughput in a single device.
A key design point is “limitless” RoCEv2 scalability. Traditional RoCEv2 fabrics struggle at scale because managing queue pairs becomes memory-heavy and brittle. Cornelis tackles that with lightweight QPs and a hardware-accelerated RoCEv2 In-Flight table that can track millions of concurrent operations while maintaining predictable latency. The CN6000 is fully compliant with Ultra Ethernet and RoCEv2, positioning it as a standards-based path to 800 Gbps Ethernet fabrics that behave more like purpose-built HPC interconnects.
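A rough calculation shows why per-connection state is the pain point; the per-QP context size below is an assumed illustrative figure, not a Cornelis or vendor specification.

```python
# Why naive full-mesh reliable-connection (RC) QPs hurt at scale: state grows with peer count.
nodes = 4096                     # endpoints in the fabric
procs_per_node = 8               # e.g., one rank per GPU
qp_context_bytes = 8 * 1024      # assumed per-QP context (queues, state); illustrative only

remote_procs = (nodes - 1) * procs_per_node          # peers each process talks to
qps_per_node = procs_per_node * remote_procs         # ~262,000 QPs on every node
state_per_node_gb = qps_per_node * qp_context_bytes / 2**30
total_qps = qps_per_node * nodes                     # ~1.1 billion QPs fabric-wide

print(f"{qps_per_node:,} QPs/node, ~{state_per_node_gb:.1f} GB of state/node, {total_qps:,} QPs total")
```

Even with generous host memory, that much connection state overflows on-NIC caches long before it exhausts DRAM, which is why lightweight QPs and a hardware in-flight table matter as much as raw port speed here.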
Cornelis is aligning the CN6000 with next-gen Intel Xeon platforms and working with partners like Intel, AMD, Lenovo, Synopsys, Altair, Atipa, Nor-Tech, Microway, PSSC Labs, and SourceCode to build end-to-end 800G solutions and Omni-Path-based switches and directors.
NVIDIA BlueField-4 and Quantum-X Photonics
On the NVIDIA side, BlueField-4 DPUs continued to show up as a central control plane and offload engine for AI factories. NVIDIA highlighted how storage vendors like DDN, VAST Data, and WEKA are adopting BlueField-4 to push storage services closer to GPUs and eliminate bottlenecks.
NVIDIA also spotlighted Quantum-X Photonics co-packaged optics InfiniBand switches, offering 800 Gb/s per port with significantly better power efficiency than traditional pluggable optics. TACC, Lambda, and CoreWeave are among the operators planning to integrate Quantum-X Photonics into their next-generation systems.
5. HPE, National Labs, and Exascale Blueprints
SC25 also reinforced how national labs are shaping the AI/HPC roadmap.
At Oak Ridge National Laboratory, HPE and AMD are partnering on Discovery and Lux—two new systems that blend large-scale simulation with AI training and inference. Lux is positioned as a dedicated AI factory for science and energy, while Discovery focuses on high-bandwidth exascale computing.
At Los Alamos National Laboratory, HPE and NVIDIA are collaborating on Mission and Vision, based on the new HPE Cray GX5000 platform and NVIDIA’s latest CPU/GPU and Quantum-X800 InfiniBand technologies. Mission targets national security workloads; Vision will serve as an unclassified AI and science system and successor to Venado.
For the broader ecosystem, these systems serve as reference architectures for how to co-design CPUs, GPUs, networks, and cooling for converged AI plus simulation workloads.
6. Quantum Computing Edges Closer to Production
Quantum wasn’t the main act at SC25, but it was no longer relegated to the demo corner.
QuEra + Dell: Quantum as Another Accelerator Class
QuEra and Dell demonstrated hybrid quantum-classical workflows where neutral-atom quantum processing units integrate into standard Dell HPC infrastructure via the Dell Quantum Intelligent Orchestrator. The point of the demo: treat quantum as a first-class accelerator alongside CPUs and GPUs instead of a separate science experiment.
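Dell didn’t expose the orchestrator’s API in detail, so the sketch below is purely hypothetical: it illustrates the “treat the QPU as another accelerator class” idea with made-up names, not the Dell Quantum Intelligent Orchestrator interface.

```python
# Hypothetical dispatcher treating a QPU as one more accelerator class.
# All names here are invented for illustration; nothing maps to Dell's or QuEra's APIs.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    kind: str                 # "gpu" or "qpu"
    params: dict = field(default_factory=dict)

def run_on_gpu(task):
    return f"{task.name}: executed on the GPU pool"

def run_on_qpu(task):
    return f"{task.name}: sampled on a neutral-atom QPU ({task.params.get('shots', 100)} shots)"

BACKENDS = {"gpu": run_on_gpu, "qpu": run_on_qpu}

def dispatch(task):
    # The scheduler treats both backends as interchangeable accelerator classes.
    return BACKENDS[task.kind](task)

workflow = [
    Task("pretrain-surrogate", "gpu"),
    Task("optimize-candidates", "qpu", {"shots": 1000}),
]
for step in workflow:
    print(dispatch(step))
```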
Quantum Computing Inc. Neurawave: Photonics for Edge AI
Quantum Computing Inc. (QCi) announced Neurawave, a compact, photonics-based reservoir computing system in a standard PCIe form factor. Operating at room temperature, Neurawave targets edge-AI workloads such as signal processing, time-series forecasting, and pattern recognition, offering fast, energy-efficient processing that complements QCi’s quantum systems.
D-Wave: Annealing as an Energy-Efficient Accelerator
D-Wave highlighted how its Advantage2 annealing quantum computer can tackle combinatorial optimization problems with lower energy use than classical approaches — an angle that resonates as AI and HPC operators watch their power budgets tighten.
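For context on the problem class, annealers target quadratic unconstrained binary optimization (QUBO) objectives. The toy example below is solved by classical brute force purely to show the kind of formulation an annealer such as Advantage2 would sample at far larger scale; the coefficients are made up.

```python
# A toy QUBO: minimize energy(x) = sum over (i, j) of Q[i, j] * x[i] * x[j] for binary x.
# Brute force works here only because the problem has three variables.
from itertools import product

Q = {
    (0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0,   # linear terms (diagonal): reward selecting each item
    (0, 1): 2.0, (1, 2): 2.0,                   # quadratic penalties for selecting adjacent items together
}

def energy(x):
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())

best = min(product([0, 1], repeat=3), key=energy)
print(best, energy(best))   # (1, 0, 1) with energy -2.0
```

At production sizes the search space explodes combinatorially, which is where the energy-per-solution argument for annealing versus classical heuristics comes in.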
7. Other Notable Infrastructure Moves
A few additional announcements round out the “plumbing” for AI factories.
Phison introduced new PCIe Gen5 Pascari X201 and D201 enterprise SSDs, tuned for AI training, hyperscale analytics, and mixed read/write inference workloads. They push Gen5 performance toward its limits, delivering high throughput and low latency for data-hungry environments.
Hammerspace showcased its AI solution aligned with the NVIDIA AI Data Platform reference design, providing a unified data foundation for RAG workloads, agentic AI pipelines, and hybrid environments. The goal is to give AI workloads instant access to the right data without re-architecting storage.
TechArena Take: The AI Factory Stack Advances
Stepping back from the logos and part numbers, it’s clear that AI now dominates the HPC agenda, driven in part by policy priorities. SC25 felt like the moment AI factories moved from concept to rapid, concrete progress.
A few patterns stand out.
First, the bottlenecks have officially moved away from raw FLOPS. The interesting innovation is happening around memory hierarchies, storage fabrics, and KV cache management—exactly the spaces WEKA, VAST, MinIO, and Hammerspace are targeting. The vendors that can prove “more useful tokens per GPU per kilowatt” are going to win the next buying cycle.
Second, power and cooling have been dragged into the AI design conversation whether facilities teams are ready or not. PCE, liquid spray cooling, direct-to-chip loops, 200 kW racks, and now sealed, liquid-cooled edge clusters like Iceotope’s KUL BOX are no longer exotic; they’re becoming prerequisites for deploying Blackwell-scale and inference-heavy clusters wherever the data lives. Cooling is quietly turning into a business lever: whoever can convert the most stranded power into usable compute wins.
Third, the network is being rebuilt around AI assumptions. 800 Gbps Ethernet with Ultra Ethernet, RoCEv2 at scale, CN6000-class SuperNICs, BlueField-4 DPUs, and co-packaged optics all point to the same conclusion: traditional data center Ethernet and “good enough” InfiniBand islands won’t cut it at multi-thousand GPU scale. Deterministic, congestion-free fabrics are table stakes if you want agentic AI to actually run reliably.
Finally, quantum and photonics are edging toward “adjacent accelerators” rather than lab toys. They’re not replacing GPUs any time soon, but they’re already being wired into the same orchestration planes and data fabrics as everything else.
Supercomputing used to be the place to talk about peak FLOPS. In 2025, it quietly turned into the place to advance the entire AI factory—from chip to coolant loop to the edge box bolted to a wall in the field.



