Google assembles four-partner chip supply chain with Broadcom, MediaTek, Marvell to challenge Nvidia in inference


Summary: Google is building the AI industry’s most diversified custom chip supply chain, with four design partners (Broadcom, MediaTek, Marvell, Intel) and a roadmap stretching from the Ironwood TPU now shipping in the millions to TPU v8 chips at TSMC 2nm in late 2027. The strategy, detailed ahead of Google Cloud Next, splits the next generation explicitly: Broadcom’s “Sunfish” for training, MediaTek’s “Zebrafish” for inference at 20-30% lower cost, with Marvell in talks to add a memory processing unit and an additional inference TPU, positioning Google’s custom silicon as the most direct challenge to Nvidia’s dominance in AI inference.

Google is assembling the most diversified custom chip supply chain in the AI industry, with four design partners, a fabrication relationship with TSMC, and a product roadmap that now stretches from the inference chips it is shipping today to the 2-nanometre processors it expects to deploy in late 2027. The strategy, detailed in a Bloomberg feature ahead of Google Cloud Next this week, positions Google’s silicon programme as the most direct challenge to Nvidia’s dominance in AI inference, the phase of computing where models serve users rather than learn from data.

The centrepiece is Ironwood, Google’s seventh-generation TPU and the first designed specifically for inference. It delivers ten times the peak performance of the TPU v5p, offers 192 gigabytes of HBM3E memory per chip with 7.2 terabytes per second of bandwidth, and scales to 9,216 liquid-cooled chips in a single superpod producing 42.5 FP8 exaflops. Ironwood is now generally available to Google Cloud customers. Google plans to produce millions of units this year, and Anthropic has committed to up to one million TPUs. Meta also has a rental arrangement.
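The per-chip figures imply striking aggregates at superpod scale. A quick back-of-the-envelope calculation, using only the numbers quoted above, shows the total memory, aggregate bandwidth, and implied per-chip compute of a full 9,216-chip pod:

```python
# Back-of-the-envelope aggregates for a full Ironwood superpod,
# derived solely from the per-chip figures quoted above.

CHIPS_PER_POD = 9216        # liquid-cooled chips in one superpod
HBM_PER_CHIP_GB = 192       # HBM3E capacity per chip
BW_PER_CHIP_TBPS = 7.2      # memory bandwidth per chip, TB/s
POD_FP8_EXAFLOPS = 42.5     # quoted aggregate FP8 compute per pod

# Total HBM3E across the pod: ~1.77 petabytes (decimal units)
total_hbm_pb = CHIPS_PER_POD * HBM_PER_CHIP_GB / 1e6

# Aggregate memory bandwidth: ~66 petabytes per second
total_bw_pbps = CHIPS_PER_POD * BW_PER_CHIP_TBPS / 1e3

# Implied per-chip compute: ~4.6 FP8 petaflops
per_chip_pflops = POD_FP8_EXAFLOPS * 1e3 / CHIPS_PER_POD

print(f"Pod HBM3E:     {total_hbm_pb:.2f} PB")
print(f"Pod bandwidth: {total_bw_pbps:.1f} PB/s")
print(f"Per-chip FP8:  {per_chip_pflops:.2f} PFLOPS")
```

In other words, a single superpod pools roughly 1.77 PB of HBM3E and about 66 PB/s of memory bandwidth, working out to roughly 4.6 FP8 petaflops per chip.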

The four-partner supply chain

Google’s chip programme now involves four distinct design partners, each handling different segments of the product line.

Broadcom, which signed a long-term agreement on 6 April to supply TPUs and networking components through 2031, handles the high-performance chip variants. It is also designing the next-generation TPU v8 training chip, codenamed “Sunfish,” targeted at TSMC’s 2-nanometre process node for late 2027. Broadcom commands more than 70% of the custom AI accelerator market and is projecting $100 billion in AI chip revenue by 2027.


MediaTek is designing the cost-optimised inference variant of the TPU v8, codenamed “Zebrafish,” also targeting TSMC 2nm in late 2027. MediaTek’s involvement began with the I/O modules and peripheral components on Ironwood, where its designs run 20 to 30% cheaper than alternatives. The TPU v8 strategy splits the product line explicitly: Broadcom builds the training chip, MediaTek builds the inference chip, and Google gains the negotiating leverage that comes from having each partner know the other exists.

Marvell Technology, which is in talks with Google to develop a memory processing unit and a new inference-focused TPU, would become the third TPU design partner if those negotiations produce a contract. Google plans to produce nearly two million of the memory processing units, with the design expected to be finalised by next year. Marvell's custom silicon business runs at a $1.5 billion annual rate across 18 cloud-provider design wins, and Nvidia invested $2 billion in the company in March.

Intel entered the picture on 9 April with a multi-year deal to supply Xeon processors and custom infrastructure processing units for Google’s AI data centre infrastructure. The arrangement covers the networking and general-purpose compute layers that surround the TPUs rather than the AI accelerators themselves.

TSMC fabricates all of Google’s custom silicon. The relationship is structural: every chip Google designs, regardless of which partner designed it, runs through TSMC’s fabs.

Why inference changes the economics

The shift from training to inference as the dominant AI compute cost is the strategic premise behind Google’s entire chip programme. Training a frontier model is a singular, intensive event. Inference is continuous and scales with every user, every query, and every product that incorporates AI. Google serves billions of AI-augmented search queries, Gemini conversations, and Cloud AI API calls daily. At that scale, the cost per inference determines the economics of the entire AI business.

Nvidia’s GPUs remain dominant for training workloads, where their programmability and the CUDA software ecosystem create switching costs that custom chips cannot easily replicate. But inference workloads are more predictable, more repetitive, and more amenable to the kind of fixed-function optimisation that custom silicon excels at. A purpose-built inference chip that costs less per query than an Nvidia GPU, even if it cannot match the GPU’s versatility, wins on the metric that matters at Google’s scale.
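A toy cost model makes that trade-off concrete. Every number below is a hypothetical placeholder for illustration, not a real GPU or TPU figure; the point is only that amortised hardware cost plus power, divided by queries served, is the metric a purpose-built inference chip competes on:

```python
# Toy cost-per-query model. All figures are hypothetical placeholders
# chosen for illustration, not real GPU or TPU numbers.

HOURS_PER_YEAR = 365 * 24

def cost_per_query(hw_cost_usd, lifetime_years, power_kw,
                   usd_per_kwh, queries_per_sec):
    """Amortised hardware plus electricity, divided by queries served."""
    hours = lifetime_years * HOURS_PER_YEAR
    total_cost = hw_cost_usd + power_kw * hours * usd_per_kwh
    total_queries = queries_per_sec * hours * 3600
    return total_cost / total_queries

# Hypothetical general-purpose GPU: pricier, hungrier, more versatile.
gpu = cost_per_query(30_000, 3, 1.0, 0.08, 100)

# Hypothetical fixed-function inference ASIC: cheaper hardware and
# lower power draw translate directly into a lower cost per query.
asic = cost_per_query(12_000, 3, 0.6, 0.08, 120)

print(f"GPU:  ${gpu:.2e} per query")
print(f"ASIC: ${asic:.2e} per query")
```

Multiplied across billions of queries a day, even a fraction-of-a-cent gap per query compounds into a decisive cost advantage, which is the article's core argument for purpose-built inference silicon.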

This is why Google is investing in multiple inference chip paths simultaneously. Ironwood serves today’s workloads. MediaTek’s Zebrafish targets the next generation at lower cost. Marvell’s proposed chips would add yet another option. The redundancy is deliberate: Google is building optionality into a supply chain where dependence on any single partner creates pricing risk, capacity risk, and the strategic vulnerability of having its AI infrastructure controlled by someone else’s roadmap.

The numbers behind the ambition

Google’s TPU shipments are projected to reach 4.3 million units in 2026, scaling to more than 35 million by 2028. Anthropic’s commitment alone represents up to one million of those chips, with access to approximately 3.5 gigawatts of next-generation TPU-based compute starting in 2027. Mizuho estimates that Broadcom’s AI revenue from its Google and Anthropic relationships will reach $21 billion in 2026, rising to $42 billion in 2027.

The custom ASIC market more broadly is growing faster than GPUs. TrendForce projects custom chip sales will increase 45% in 2026, compared with 16% growth in GPU shipments. The market is expected to reach $118 billion by 2033. Google is not the only hyperscaler building custom inference silicon: Amazon has Trainium and Inferentia, Microsoft has Maia, and Anthropic is exploring its own chip programme. But Google’s multi-partner, multi-generation approach is the most architecturally ambitious.

What to watch at Cloud Next

Google Cloud Next opens on Wednesday in Las Vegas with keynotes from Sundar Pichai and Thomas Kurian. The conference is expected to showcase the next-generation TPU architecture and the custom silicon roadmap that connects Ironwood to the v8 generation. The timing of the Bloomberg feature, one day after The Information broke the Marvell talks and two days before Cloud Next, suggests Google is using the conference to frame its chip programme as a coherent strategy rather than a series of individual partnerships.

The challenge Nvidia faces is not that any single Google chip will outperform its GPUs. It is that Google is building a system in which multiple custom chips, each optimised for a specific workload and cost point, collectively reduce the share of Google’s AI compute that runs on Nvidia hardware. Nvidia’s response has been to embed itself in the custom chip ecosystem rather than fight it: the $2 billion Marvell investment and the NVLink Fusion programme ensure Nvidia retains a position in racks where its GPUs are supplemented or replaced by ASICs.

For Google, the bet is that controlling its own silicon, across multiple partners and multiple generations, will produce a cost advantage in inference that compounds over time. The scale of Nvidia’s business means the incumbent will not be displaced quickly. But the economics of inference favour custom silicon over general-purpose GPUs, and no company has more inference volume than Google. The four-partner supply chain, the dual-track v8 roadmap, and the millions of Ironwood chips shipping this year are the infrastructure for a competitive position that Google expects to strengthen with every query it serves.





As I’m writing this, NVIDIA is the largest company in the world, with a market cap exceeding $4 trillion. Team Green is now the leader among the Magnificent Seven of the tech world, having surpassed them all in just a few short years.

The company has managed to reach these incredible heights with smart planning and by making the right moves for decades, the latest being the decision to sell shovels during the AI gold rush. Considering the current hardware landscape, there’s simply no reason for NVIDIA to rush a new gaming GPU generation for at least a few years. Here’s why.

Scarcity has become the new normal

Not even Nvidia is powerful enough to overcome market constraints

Global memory shortages have been a reality since late 2025, and they aren’t just affecting RAM and storage manufacturers. Rather, this impacts every company making any product that contains memory or storage—including graphics cards.

NVIDIA sells GPU-and-memory bundles to its board partners, which solder them onto PCBs and add cooling to create full-blown graphics cards. That means NVIDIA doesn’t just have to battle other tech giants to secure a chunk of TSMC’s limited production capacity for its GPU chips; it also has to procure massive amounts of GPU memory, which has never been harder or more expensive to obtain.

While a company as large as NVIDIA certainly has long-term contracts that guarantee stable memory prices, those contracts won’t last forever. The company has likely had to sign new ones already, judging by the GPU price surge that began in early 2026 and has kept gaming graphics cards overpriced.

With GPU memory costing more than ever, NVIDIA has little reason to rush a new gaming GPU generation, because its gaming earnings are just a drop in the bucket compared to its total earnings.

NVIDIA is an AI company now

Gaming GPUs are taking a back seat

A graph showing NVIDIA revenue breakdown in the last few years. Credit: appeconomyinsights.com

NVIDIA’s gaming division had been its golden goose for decades, but come 2022, the company’s data center and AI division’s revenue started to balloon dramatically. By the beginning of fiscal year 2023, data center and AI revenue had surpassed that of the gaming division.

In fiscal year 2026 (which for NVIDIA runs from late January 2025 to late January 2026), gaming revenue has contributed less than 8% of the company’s total earnings so far. The data center division, on the other hand, has generated almost 90% of NVIDIA’s total revenue in fiscal year 2026. What I’m trying to say is that NVIDIA is no longer a gaming company; it’s all about AI now.

Considering that we’re in the middle of the biggest memory shortage in history, and that its AI GPUs rake in more than ten times the revenue of gaming GPUs, there’s little reason for NVIDIA to funnel exorbitantly priced memory toward gaming GPUs. It’s much more profitable to put every memory chip it can get its hands on into AI GPU racks and keep collecting mountains of cash from AI behemoths.

The RTX 50 Super GPUs might never get released

A sign of times to come

NVIDIA’s RTX 50 Super series was supposed to increase the memory capacity of its most popular gaming GPUs. The 16GB RTX 5080 was to be superseded by a 24GB RTX 5080 Super, the 16GB RTX 5070 Ti was in line for a similar upgrade, and an 18GB RTX 5070 Super was to replace its 12GB non-Super sibling. But according to recent reports, NVIDIA has put the lineup on ice.

The RTX 50 Super launch had been slated for this year’s CES in January, but after missing the show, it now looks like NVIDIA has delayed the lineup indefinitely. According to a recent report, NVIDIA doesn’t plan to launch a single new gaming GPU in 2026. Worse still, the RTX 60 series, which had been expected to debut sometime in 2027, has also been delayed.

A report by The Information (via Tom’s Hardware) states that NVIDIA had finalized the design and specs of its RTX 50 Super refresh, but the RAM-pocalypse threw a wrench into the works, forcing the company to “deprioritize RTX 50 Super production.” In other words, it’s exactly what I said a few paragraphs ago: selling enterprise GPU racks to AI companies is far more lucrative than selling comparatively cheaper GPUs to gamers, especially now that memory prices have been skyrocketing.

Before putting the RTX 50 Super series on ice, NVIDIA had already slashed its gaming GPU supply by about a fifth and started prioritizing models with less VRAM, like the 8GB versions of the RTX 5060 and RTX 5060 Ti, so this news isn’t that surprising.

So when can we expect RTX 60 GPUs?

Late 2028-ish?

A GPU with a pile of money around it. Credit: Lucas Gouveia / How-To Geek

The good news is that the RTX 60 series is definitely in the pipeline, and we will see it sooner or later. The bad news is that its release date is up in the air, and it’s best not to even think about pricing. The word on the street around CES 2026 was that NVIDIA would release the RTX 60 series in mid-2027, give or take a few months. But as of this writing, it’s increasingly likely we won’t see RTX 60 GPUs until 2028.

If you’ve been following the discussion around memory shortages, this won’t be surprising. In late 2025, the prognosis was that we wouldn’t see the end of the RAM-pocalypse until 2027, maybe 2028. But the chairman of SK Hynix, one of the world’s three largest memory manufacturers, recently warned that the global memory shortage may last well into 2030.

If that turns out to be true, and if the global AI data center boom doesn’t slow down in the next few years, I wouldn’t be surprised if NVIDIA delays the RTX 60 GPUs as long as possible. There’s a good chance we won’t see them until the second half of 2028, and I wouldn’t be surprised if they miss that window as well if memory supply doesn’t recover by then. Data center GPUs are simply too profitable for NVIDIA to reserve a meaningful portion of memory for gaming graphics cards as long as shortages persist.


At least current-gen gaming GPUs are still a great option for any PC gamer

If there is a silver lining here, it is that current-gen gaming GPUs (NVIDIA RTX 50 and AMD Radeon RX 90) are still more than powerful enough for any current AAA title. Considering that Sony is reportedly delaying the PlayStation 6 and that global PC shipments are projected to see a sharp, double-digit decline in 2026, game developers have little incentive to push requirements beyond what current hardware can handle.

DLSS 5, meanwhile, may be the future of gaming, but no one likes it right now, and it will take a few years (and likely the arrival of the RTX 60 lineup) for it to mature and become usable on anything that’s not a heckin’ RTX 5090.

If you’re open to buying used GPUs, even last-gen gaming graphics cards offer tons of performance and can handle any AAA game you throw at them. While we likely won’t get a new gaming GPU from NVIDIA for at least a few years, at least the ones we’ve got are great today and will continue to chew through any game for the foreseeable future.
