NextSilicon Maverick-2 HPC Chip Beats Nvidia

Forget AI overload: NextSilicon says its Maverick-2 delivers up to 10x gains in high-performance computing. It’s the chip science labs have been asking for.

Key Takeaways

  • NextSilicon claims up to 10x gains over GPUs on some HPC workloads, and GPU-class HPCG results at half the power, thanks to a dataflow design.
  • Already deployed at Sandia; unmodified code runs smoothly.
  • Revives double-precision computing Nvidia ditched for AI.

HPC’s rebellion starts now.

NextSilicon Maverick-2 isn’t chasing AI dreams—it’s storming the high-performance computing fortress Nvidia abandoned. Picture this: particle physicists crunching collision data, materials scientists simulating alloys, drug hunters modeling proteins. They’ve been stuck with chips optimized for chatbot training, not their precision-hungry workloads. But Maverick-2? It promises up to 10x the performance of modern GPUs, sipping a fraction of the power. And it’s already humming in Sandia National Labs’ Vanguard-II supercomputer.

Look, Nvidia’s GB300 GPUs? A measly 1.3 teraFLOPS of FP64 (double precision), down from 45 teraFLOPS in the GB200. They’re all-in on FP4 for inference. AMD clings on with 81.7 teraFLOPS of vector FP64 on Instinct, but NextSilicon claims 4x better performance per watt than Nvidia’s B200 and 20x over Intel’s Sapphire Rapids. In the HPCG benchmark, Maverick-2 hits 600 gigaFLOPS at 750 watts, which NextSilicon says matches top GPUs at half the power.
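To put the HPCG claim in perspective, here is a rough sketch of the arithmetic. The 600 gigaFLOPS and 750 watt figures are the ones quoted above; the GPU comparison point is only what the "half the power" claim implies (the same score at twice the wattage), not a measured result, and the function and variable names are illustrative.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Sustained throughput divided by power draw."""
    return gflops / watts


# Figures quoted in the article for Maverick-2 on HPCG.
MAVERICK_HPCG_GFLOPS = 600.0
MAVERICK_WATTS = 750.0

maverick_eff = gflops_per_watt(MAVERICK_HPCG_GFLOPS, MAVERICK_WATTS)

# ASSUMPTION: "matches top GPUs at half the power" is read literally as the
# same HPCG score at twice the wattage. This is not a measured GPU number.
implied_gpu_eff = gflops_per_watt(MAVERICK_HPCG_GFLOPS, 2 * MAVERICK_WATTS)

if __name__ == "__main__":
    print(f"Maverick-2:  {maverick_eff:.2f} GFLOPS/W")
    print(f"Implied GPU: {implied_gpu_eff:.2f} GFLOPS/W")
    print(f"Ratio:       {maverick_eff / implied_gpu_eff:.1f}x")
```

Read this way, the HPCG result works out to roughly a 2x efficiency edge, which appears to be a separate measurement from the 4x FP64 performance-per-watt claim against the B200. The two figures shouldn’t be mixed.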

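PageRank? A claimed 10x speedup in graph analytics, on workloads where GPUs tap out past 25GB datasets. Random memory access, the pattern behind high-throughput databases? 32.6 GUPS (giga-updates per second), which NextSilicon puts at 22x CPUs and 6x GPUs. All on unmodified code, they say. The company hasn’t named the exact parts it compared against, but the numbers dazzle.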
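For readers unfamiliar with GUPS: it counts how many random read-modify-write updates a system can apply to a large in-memory table per second, in billions. The sketch below shows the shape of such a kernel, under loud assumptions: it is not the official benchmark code, the random stream is a plain 64-bit linear congruential generator standing in for the benchmark’s own generator, and all function names are hypothetical. The last two lines simply divide the quoted 32.6 GUPS by the quoted ratios to show the baselines those ratios imply.

```python
import time

MASK64 = 0xFFFFFFFFFFFFFFFF


def random_access_updates(table_bits: int, n_updates: int, seed: int = 1) -> list:
    """Apply n_updates XOR updates at pseudo-random indices of a 2**table_bits table."""
    size = 1 << table_bits
    index_mask = size - 1
    table = list(range(size))
    x = seed
    for _ in range(n_updates):
        # ASSUMPTION: a 64-bit LCG (Knuth's MMIX constants) stands in for the
        # generator a real GUPS implementation would use.
        x = (x * 6364136223846793005 + 1442695040888963407) & MASK64
        # The index comes from the high bits, so consecutive updates land at
        # scattered addresses, which defeats caches and prefetchers.
        table[(x >> 32) & index_mask] ^= x
    return table


def measure_gups(table_bits: int = 16, n_updates: int = 200_000) -> float:
    """Time the kernel and return giga-updates per second."""
    start = time.perf_counter()
    random_access_updates(table_bits, n_updates)
    elapsed = time.perf_counter() - start
    return n_updates / elapsed / 1e9


# Baselines implied by the article's ratios (arithmetic only, not measurements).
implied_cpu_gups = 32.6 / 22   # about 1.5 GUPS
implied_gpu_gups = 32.6 / 6    # about 5.4 GUPS

if __name__ == "__main__":
    print(f"Pure-Python toy kernel: {measure_gups():.6f} GUPS")
```

In pure Python this mostly measures interpreter overhead, so the printed number says nothing about real hardware. The point is the access pattern: every update touches an unpredictable address, which is why the metric stresses memory systems rather than arithmetic units.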

Why Does NextSilicon Maverick-2 Outrun GPUs in HPC?

Dataflow architecture: think a bustling kitchen alive with motion, not a rigid assembly line. The company’s own analogy nails it:

“In a traditional processor you have a cookbook (program) that you follow step by step regardless of whether the ingredients (data) are ready,” CEO Elad Raz explained in a blog post. “In a dataflow processor, each cooking station activates the moment its ingredients arrive, working in parallel with other stations.”

No fetching instructions, no decoding bottlenecks. Just a grid of ALUs wired in a graph, firing when data hits. Vast majority of die real estate? Pure compute logic, NextSilicon says. Groq’s LPUs echo this, but NextSilicon’s compiler swallows C++, Python, Fortran, and CUDA, then maps them straight to hardware. No rewrite hell for devs.
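The firing rule described above can be sketched in a few lines of software. This is a toy model of the general dataflow idea, not NextSilicon’s hardware or compiler: the Node class, the run function, and the example graph are all hypothetical names made up for illustration. Each node is a "cooking station" that fires the moment its last operand arrives, with no program counter deciding the order.

```python
from collections import deque
import operator


class Node:
    """One compute station: fires once every input operand has arrived."""

    def __init__(self, name, op, n_inputs):
        self.name = name
        self.op = op
        self.operands = [None] * n_inputs
        self.pending = n_inputs      # operands still missing
        self.consumers = []          # (target node, input slot) pairs

    def connect(self, target, slot):
        self.consumers.append((target, slot))


def run(tokens):
    """tokens: (node, slot, value) triples injected in arrival order."""
    ready = deque()
    results = {}
    fired = []

    def deliver(node, slot, value):
        node.operands[slot] = value
        node.pending -= 1
        if node.pending == 0:        # the dataflow firing rule
            ready.append(node)

    for node, slot, value in tokens:
        deliver(node, slot, value)

    while ready:
        node = ready.popleft()
        value = node.op(*node.operands)
        fired.append(node.name)
        results[node.name] = value
        for target, slot in node.consumers:
            deliver(target, slot, value)
    return results, fired


def build_and_run(a, b, c, d):
    """Evaluate (a + b) * (c - d) as a three-node dataflow graph."""
    add = Node("add", operator.add, 2)
    sub = Node("sub", operator.sub, 2)
    mul = Node("mul", operator.mul, 2)
    add.connect(mul, 0)
    sub.connect(mul, 1)
    # Operands arrive out of program order; the subtraction completes first.
    tokens = [(sub, 0, c), (add, 0, a), (sub, 1, d), (add, 1, b)]
    return run(tokens)


if __name__ == "__main__":
    results, fired = build_and_run(1, 2, 7, 3)
    print(results["mul"], fired)
```

Here the subtraction fires before the addition simply because its ingredients showed up first, and the multiply fires only once both upstream results exist. In hardware the add and subtract stations would run at the same time; this sequential queue only models the ordering constraint.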

And here’s my bold take, absent from their hype: this echoes the dataflow and systolic-array research machines of the ’80s, like the Manchester Dataflow Machine and Carnegie Mellon’s Warp. Those designs showed that data-driven execution could work, but they never displaced vector supercomputers, and GPUs later stole the show. Maverick-2? It could spark HPC’s renaissance, pulling science from AI’s shadow. Imagine climate models 10x faster, fusion breakthroughs unlocked.

But skepticism lingers. Peak TFLOPS? Meh. The Top500’s gap between Rmax (measured Linpack performance) and Rpeak (theoretical peak) proves that sustained, real-world performance is what counts. NextSilicon ships 96GB PCIe cards and 192GB OAM modules. Customers like Sandia vouch early. Nvidia’s PR spins AI everywhere; this quietly rebuilds the old guard.

Deployment’s here.

Vanguard-II proves it: no lab fantasies. Engineers brought existing code over smoothly, and benchmarks lit up. If the compiler holds up (big if), supercomputing centers could ditch their GPU porting struggles.

Dataflow scales weirdly well. Graphs balloon? Maverick-2 thrives. AI agents need RAG, vector search? It dips in, at a claimed 6x GPU speed. But the core pitch? Double-precision revival. Nvidia traded FP64 for tokens; NextSilicon bets on atoms, quarks, cures.

Can Maverick-2 Save Scientific Computing from AI Overlords?

Absolutely. Envision supercomputers as data rivers, not instruction treadmills. Power walls crumble: 750W for GPU-matching HPCG? That’s serious efficiency, if it holds up outside NextSilicon’s own benchmarks. Among the incumbents, AMD’s Instinct is the lone FP64 holdout; Intel lags. Maverick-2? Fresh blood.

Critique the spin: NextSilicon teases without full specs. Are its FP64 figures vector or matrix? We’ll grill ‘em. Still, unmodified code wins hearts. Historical parallel? The gallium arsenide gamble of the Cray-3 against the steady march of silicon. Dataflow might be HPC’s silicon photonics moment, but for architecture: an exotic bet that pays off only if it outruns the incumbent’s momentum.

Prediction: By 2026, Maverick clusters crack the Top500. Nvidia? Watches warily, AI goldmine intact, while the sciences bolt. One chip startup shakes the throne.

We’ve seen accelerators flop on programmability—hello, every FPGA pitch. But Sandia’s buy-in? Real. PCIe ease means drop-in upgrades. OAM for dense racks. HPC buyers salivate.

Remember when GPUs democratized HPC, then AI hijacked ‘em? Maverick-2 flips the script: it specializes where generality fails. (Nvidia, take notes.)


Frequently Asked Questions

What is NextSilicon Maverick-2?

A dataflow accelerator for HPC workloads like simulations and graph analytics. NextSilicon claims up to 10x GPU performance at lower power; AI is not the focus.

How does Maverick-2 compare to Nvidia GPUs?

Up to 10x faster in PageRank, half power in HPCG, 4x perf-per-watt FP64 vs B200; excels where Nvidia skimps on precision.

Will NextSilicon Maverick-2 work for AI?

It handles RAG/vector search 6x GPU speed, but shines in non-AI HPC—compiler takes CUDA if you push it.

Written by Aisha Patel

Former ML engineer turned writer. Covers computer vision and robotics with a practitioner perspective.

Originally reported by The Register HPC
