Cerebras Unveils WSE-4 Wafer-Scale Chip With 7 Trillion Transistors for AI Training

Cerebras Systems, the AI chip startup known for building processors the size of dinner plates, has announced its fourth-generation Wafer-Scale Engine. The WSE-4 contains 7 trillion transistors fabricated on a single silicon wafer, making it the largest and most complex chip ever produced. The company says the new processor can train frontier AI models up to four times faster than its predecessor while consuming significantly less energy per computation.

Bigger Than Anything Else

To understand what makes Cerebras unusual, consider scale. A typical NVIDIA H100 GPU — the workhorse of modern AI data centers — contains 80 billion transistors on a chip roughly the size of a postage stamp. The Cerebras WSE-4 is the size of an entire 300mm silicon wafer, approximately 46,225 square millimeters of active compute area. It contains nearly 90 times as many transistors as the H100.
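
As a back-of-envelope check on those figures (a quick sketch; the H100's die area of roughly 814 square millimeters is NVIDIA's published spec, not a number from this announcement):

```python
# Scale comparison using the figures cited above. The H100 die area
# (~814 mm^2) is NVIDIA's published figure, included as an assumption.

WSE4_TRANSISTORS = 7e12    # 7 trillion (Cerebras)
H100_TRANSISTORS = 80e9    # 80 billion (NVIDIA)
WSE4_AREA_MM2 = 46_225     # full 300 mm wafer, square cut
H100_AREA_MM2 = 814        # assumed H100 die size

print(f"Transistor ratio: {WSE4_TRANSISTORS / H100_TRANSISTORS:.0f}x")  # ~88x
print(f"Area ratio:       {WSE4_AREA_MM2 / H100_AREA_MM2:.0f}x")        # ~57x
```

The transistor ratio outpaces the area ratio because the WSE-4 sits on a denser process node, as detailed below.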

This size advantage translates directly into performance for certain workloads. AI model training involves massive matrix multiplications that benefit from keeping data close to the processing cores. On a conventional GPU cluster, data must constantly shuttle between individual GPUs over network interconnects, creating bottlenecks. On a wafer-scale chip, the entire model can reside on a single device with on-chip memory bandwidth that dwarfs anything achievable across a network.
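
A rough sketch of the difference, using an assumed 900 GB/s NVLink-class link for the GPU cluster and the 700 petabytes per second of on-chip bandwidth Cerebras quotes for the WSE-4 (see Technical Details below):

```python
# Illustrative only: time to move 10 GB of activations or gradients
# at cluster-interconnect speed versus on-chip speed.

DATA_GB = 10
LINK_GB_PER_S = 900        # assumed NVLink-class inter-GPU bandwidth
ONCHIP_GB_PER_S = 700e6    # 700 PB/s expressed in GB/s (Cerebras figure)

print(f"Over an inter-GPU link: {DATA_GB / LINK_GB_PER_S * 1e3:.1f} ms")    # ~11 ms
print(f"On-chip:                {DATA_GB / ONCHIP_GB_PER_S * 1e9:.1f} ns")  # ~14 ns
```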

"The memory wall is the fundamental bottleneck in AI training, and we've moved the wall," said Andrew Feldman, Cerebras co-founder and CEO. "With WSE-4, a model with up to 600 billion parameters can train entirely on-chip without any off-chip memory access during forward and backward passes."

Technical Details

The WSE-4 is fabricated on TSMC's N3E process node, a step up from the N5 process used in the WSE-3. It features 1.2 million AI-optimized cores, 300 GB of on-chip SRAM, and 700 petabytes per second of internal memory bandwidth. External connectivity comes via 1.2 terabits per second of I/O bandwidth to attached storage and networking.
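
Dividing those totals across the cores gives a feel for the design (simple arithmetic on the quoted numbers, not a Cerebras-published per-core breakdown):

```python
# Per-core share of the quoted aggregate figures.

CORES = 1.2e6              # 1.2 million AI-optimized cores
SRAM_BYTES = 300e9         # 300 GB on-chip SRAM
BW_BYTES_PER_S = 700e15    # 700 PB/s internal memory bandwidth

print(f"SRAM per core:      {SRAM_BYTES / CORES / 1e3:.0f} KB")        # ~250 KB
print(f"Bandwidth per core: {BW_BYTES_PER_S / CORES / 1e9:.0f} GB/s")  # ~583 GB/s
```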

Cerebras has also redesigned its software stack, called Cerebras Software Platform 4.0, to support the latest model architectures including mixture-of-experts, state-space models, and diffusion transformers. The platform now includes an automated model-partitioning tool that can split models larger than 600 billion parameters across multiple WSE-4 systems while minimizing inter-chip communication.
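
Cerebras has not said how the partitioning tool works internally, but the general idea behind such tools can be sketched as a greedy layer-wise split: assign contiguous layers to each system so that only the activations at chunk boundaries cross chips. The function below is purely illustrative, not Cerebras' algorithm:

```python
# Hypothetical sketch: split an ordered list of per-layer parameter
# counts into n contiguous chunks with roughly equal totals, so that
# inter-chip traffic is limited to chunk-boundary activations.

def partition_layers(layer_params: list[int], n_systems: int) -> list[list[int]]:
    total = sum(layer_params)
    target = total / n_systems
    chunks, current, acc = [], [], 0
    for i, params in enumerate(layer_params):
        current.append(i)
        acc += params
        # close a chunk once it reaches its share, saving layers for the rest
        if acc >= target and len(chunks) < n_systems - 1:
            chunks.append(current)
            current, acc = [], 0
    chunks.append(current)
    return chunks

# e.g. 96 equal-sized layers across 3 systems -> 32 layers per system
print([len(c) for c in partition_layers([1] * 96, 3)])  # [32, 32, 32]
```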

Power consumption is rated at 28 kilowatts per WSE-4 system, which sounds staggering for a single chip but is considerably less than the 40-60 kilowatts consumed by a rack of NVIDIA GPUs delivering comparable training throughput.
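
For a sense of what those power figures mean over a full training run (a rough sketch; the 30-day run length and $0.10/kWh electricity price are assumptions):

```python
# Energy and cost for a 30-day run at the power levels cited above.

HOURS = 30 * 24
PRICE_PER_KWH = 0.10  # assumed electricity rate

for label, kw in [("WSE-4 system", 28), ("GPU rack, low", 40), ("GPU rack, high", 60)]:
    kwh = kw * HOURS
    print(f"{label}: {kwh:,.0f} kWh (~${kwh * PRICE_PER_KWH:,.0f})")
```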

The NVIDIA Challenge

Despite its technical achievements, Cerebras faces the same challenge that has confronted every NVIDIA competitor: the software ecosystem. NVIDIA's CUDA platform has been the dominant programming framework for GPU computing for over 15 years, and virtually every AI research lab and cloud provider has built its infrastructure around it.

Cerebras counters this by offering full compatibility with PyTorch, the most widely used AI research framework, and has recently added JAX support. The company claims that most existing training scripts can run on Cerebras hardware with minimal modification, though independent benchmarks to verify this claim are still limited.
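
In practice, the claim is that an ordinary PyTorch training script like the one below should run largely unchanged; how the Cerebras compiler maps it onto the wafer is the proprietary part and is not shown here:

```python
# A minimal, standard PyTorch training loop; nothing Cerebras-specific.
# The model and data are stand-ins.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for step in range(10):
    x = torch.randn(32, 512)       # stand-in input batch
    target = torch.randn(32, 512)  # stand-in regression target
    optimizer.zero_grad()
    loss = loss_fn(model(x), target)
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.4f}")
```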

The startup has secured notable customers including GlaxoSmithKline for drug discovery, the Mayo Clinic for medical imaging AI, and several undisclosed defense contractors. However, its total installed base remains a fraction of NVIDIA's. Analysts at SemiAnalysis estimate that Cerebras shipped roughly 200 WSE-3 systems in 2025, compared to hundreds of thousands of NVIDIA GPUs sold into data centers during the same period.

Funding and Market Position

Cerebras has raised over $4.7 billion in total funding and was last valued at $14 billion following a Series G round in late 2025. The company filed for an IPO in 2024 but postponed it amid market volatility. Feldman said the WSE-4 launch positions the company for a public listing "when conditions are right," without providing a timeline.

The AI chip market is projected to reach $180 billion by 2028, according to Gartner. NVIDIA currently controls an estimated 80 percent of the AI training accelerator market, but growing demand and power constraints in data centers have created openings for alternative architectures.

The WSE-4 will be available for order in Q3 2026, with first deliveries expected in Q4. Pricing has not been disclosed but is expected to reach several million dollars per system.
