Memory is the binding constraint on AI, and the supply gap will not close before 2028

Hyperscalers are pouring half their capital budgets into memory, and the market is already 30% short of demand. The shortage is not a blip. It is a structural wall with a timeline measured in years, not quarters.

By The Editor

Dwarkesh Patel puts the capital allocation figure plainly: hyperscalers are spending 50% of their CapEx this year on memory. Martin Casado adds context that makes that number land harder: Microsoft, Meta, and Google are each on track to spend over 50% of their revenue on CapEx in 2026. If both figures hold, memory alone is absorbing roughly half of half, or something close to a quarter, of the total revenue of three of the largest companies on earth. SemiAnalysis and CLSA, in public projections, estimate memory’s share of hyperscaler CapEx could reach 48% by 2027, up from roughly 8% in 2023 and 2024, a sixfold increase. The shift is not gradual. It is a structural reallocation happening faster than most supply chains can respond.
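The arithmetic behind those shares is simple enough to check by hand, but a short sketch makes the inputs explicit. The numbers below are only the round figures quoted above; the function and variable names are illustrative, and the "over 50%" revenue figure is treated as exactly 50%, so the result is a floor rather than an estimate.

```python
# Back-of-envelope check using the round figures quoted in the article.
# Names are illustrative; nothing here comes from company filings.

def memory_share_of_revenue(capex_share_of_revenue: float,
                            memory_share_of_capex: float) -> float:
    """Fraction of revenue that ends up spent on memory."""
    return capex_share_of_revenue * memory_share_of_capex


# "Over 50%" of revenue on CapEx, 50% of CapEx on memory.
share = memory_share_of_revenue(0.50, 0.50)

# Projected memory share of CapEx: roughly 8% (2023-2024) to 48% (2027).
growth_multiple = round(0.48 / 0.08, 2)

print(f"Memory as a share of revenue: {share:.0%}")
print(f"Growth in memory's CapEx share: {growth_multiple}x")
```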

The demand is colliding with a hard supply ceiling. Patrick O’Shaughnessy reports that DRAM, NAND, and PCB markets are already running approximately 30% short of demand. Dylan Patel argues that true incremental supply from new capacity will not arrive until 2028, even with capacity growing 20 to 30 percent per year. The lag comes from how long it takes to build and qualify semiconductor fabrication capacity: capital committed today does not produce qualified wafers for years. That gap is the market’s defining feature right now, and it is not going to be resolved by enthusiasm or investment announcements.

The price signal is already visible across the stack. Andrew Feldman points to Micron posting 80 to 85 percent gross margins on HBM as the clearest evidence of how severe the constraint has become. Those are not normal semiconductor margins. They reflect a market where supply is so short that buyers have little choice but to pay. Caitlin Kalinowski predicts memory prices will probably double, and she has started advising startups and companies to pre-buy and stockpile memory if they can afford it, specifically to ride out price spikes. Jake Cooper notes that Railway’s servers have actually appreciated in value as RAM prices rise. Server hardware appreciating. That is the kind of market inversion that signals something structural, not cyclical.

“With Blackwell finally, which was deployed in… Maybe last year. You finally have a scale-up on the order of 10-20 terabytes, which is enough for a 5T model plus KV cache.”
Reiner Pope

Beyond price, the shortage is shaping what AI systems can actually do. Reiner Pope draws a direct line from memory bandwidth constraints to the stall in context length expansion. Models went from roughly 8,000 tokens to 100,000 to 200,000 tokens over the early years of the large language model era. For the past year or two, they have hovered there. Pope’s argument is that the binding limit is not compute. It is HBM memory bandwidth, which has not improved fast enough to make longer contexts cost-viable. The practical ceiling on what models can hold in working memory during inference is being set not by algorithmic progress but by the physics and economics of memory chips.
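To see why bandwidth rather than compute becomes the limit, it helps to sketch how the KV cache grows with context. In standard attention, every generated token requires reading the cached keys and values for the whole context, so the cache size sets a floor on memory traffic per token. The model shape and the bandwidth figure below are hypothetical round numbers chosen for illustration, not the specifications of any model or chip named in this piece.

```python
# Rough KV cache sizing for a single sequence.
# All defaults are hypothetical round numbers, not a real model's config.

def kv_cache_bytes(context_tokens: int,
                   n_layers: int = 80,
                   n_kv_heads: int = 8,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one key vector and one value vector per layer and KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


def read_time_ms(cache_bytes: float,
                 bandwidth_bytes_per_s: float = 8e12) -> float:
    """Milliseconds to stream the cache once at an assumed memory bandwidth."""
    return cache_bytes / bandwidth_bytes_per_s * 1e3


for context in (8_000, 100_000, 200_000):
    size = kv_cache_bytes(context)
    print(f"{context:>7,} tokens: {size / 1e9:6.1f} GB of KV cache, "
          f"~{read_time_ms(size):.2f} ms to read once")
```

The cache, and the time to read it for each new token, scales linearly with context length. Under these assumptions a 200,000-token context costs 25 times the memory traffic of an 8,000-token one, per sequence and per generated token, before any batching.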

Pope also notes that Nvidia’s Blackwell generation finally provides scale-up memory on the order of 10 to 20 terabytes, enough to hold a roughly 5 trillion parameter model plus KV cache. That matters because it changes what is possible at inference scale. But it does not dissolve the bandwidth constraint, and it does not address the broader shortage. A cluster built on Blackwell still requires the same HBM that everyone else is competing for.
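A rough sizing shows why 10 to 20 terabytes is the relevant threshold. The sketch below assumes weights stored at either 1 or 2 bytes per parameter; the source does not state a precision, so both are labeled assumptions, and whatever the weights leave over is treated as headroom for KV cache.

```python
# Weight footprint of a 5-trillion-parameter model versus scale-up memory.
# Bytes-per-parameter values are assumptions; the source gives no precision.

TB = 10**12  # decimal terabyte


def weight_bytes(n_params: int, bytes_per_param: int) -> int:
    """Bytes needed to hold the model weights alone."""
    return n_params * bytes_per_param


def kv_headroom_bytes(scale_up_bytes: int, n_params: int,
                      bytes_per_param: int) -> int:
    """Memory left for KV cache once weights are resident (may be <= 0)."""
    return scale_up_bytes - weight_bytes(n_params, bytes_per_param)


N_PARAMS = 5 * 10**12  # "5T model" from the quote

for domain_tb in (10, 20):
    for bytes_per_param in (1, 2):
        left = kv_headroom_bytes(domain_tb * TB, N_PARAMS, bytes_per_param)
        print(f"{domain_tb} TB domain, {bytes_per_param} byte(s)/param: "
              f"weights {weight_bytes(N_PARAMS, bytes_per_param) // TB} TB, "
              f"headroom {left // TB} TB")
```

On these assumptions, a 10 TB domain fits the weights at 2 bytes per parameter with nothing left for cache, which is consistent with Pope framing the range as 10 to 20 terabytes rather than a single number.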

The ripple effects are reaching well outside data centers. Dwarkesh Patel warns that smartphone shipment volume will fall 30 percent because there is not enough memory to go around. Consumer electronics manufacturers are being crowded out by hyperscaler demand. The analogy to wartime industrial reallocation is imprecise but not entirely off: a dominant buyer is absorbing supply that previously went to other markets, and those markets are shrinking as a result. Optical fiber pricing has moved similarly. Yaroslav Azhnyuk reports that fiber cable used in drones went from roughly four dollars per kilometer to thirty-two dollars per kilometer, an eightfold increase, in a matter of months early this year, driven partly by data center demand pulling on the same supply chains.

What the evidence describes is not a shortage that the market will quietly correct. The timeline Dylan Patel identifies, with true incremental supply arriving no earlier than 2028, means at least two more years of constrained capacity, elevated margins for memory producers, and a widening gap between what AI infrastructure demands and what the supply chain can deliver. The companies that locked in supply early, or that can afford Kalinowski’s advice to stockpile, will hold a meaningful structural advantage. Those that did not are already paying for it.

❦

The Editor, for the readers of Signal Headquarters
