Falling AI prices are not shrinking the market; they are multiplying it
Every time frontier AI model prices drop, consumption rises by a multiple that defies the original forecast. The pattern has a name, a 160-year-old precedent, and data points accumulating fast enough to treat it as settled behavior rather than theory.
Krishna Rao puts the observation plainly: when Anthropic cut the price of its Opus models, consumption rose far more than anyone had anticipated. That is not a surprising outcome in retrospect. It is, however, a costly one to have underestimated, because the same mechanism is now operating at economy scale.
The principle involved is old. In 1865, the economist William Stanley Jevons observed that more efficient coal-burning steam engines did not reduce total coal consumption. Cheaper, more accessible energy unlocked new applications, and total demand rose, an effect now known as the Jevons paradox. The same rebound effect is now visible in AI inference. Dara Khosrowshahi, describing how Uber grew, frames it in exactly these terms: when something becomes radically more convenient or cheaper, the market expands well beyond the boundaries anyone originally measured. Those who sized the opportunity by looking at the existing black car and taxi industries underestimated the actual outcome by orders of magnitude. Khosrowshahi’s point is that Uber’s current scale is a direct result of that expansion, not of capturing a fixed market.
The Cerebras data illustrates the speed of the rebound. Andrew Feldman reports that token spending per engineer at the company went from under $1,000 to between $25,000 and $30,000 in eight months. That is not a gradual drift. It is the behavior of a constraint being removed. Cat Wu, describing the pattern among knowledge workers more broadly, observes that every model jump or substantial product improvement causes people to delegate far more tasks to AI and to spend significantly more on tokens as a result. The cost per user rises even as the cost per token falls, because the lower price expands the set of things worth delegating.
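The arithmetic behind that last sentence can be sketched with a constant-elasticity demand curve. This is an illustration under stated assumptions, not a model fitted to any figure quoted here: the function name `monthly_spend`, the prices, the baseline volume, and the elasticity values are all hypothetical. What it shows is that total spend rises after a price cut only when demand is elastic (elasticity above 1), which is the regime the anecdotes above describe.

```python
def monthly_spend(price, base_price, base_volume, elasticity):
    """Total spend when volume follows Q = Q0 * (P / P0) ** (-elasticity).

    All inputs are hypothetical. Elasticity is the percentage rise in
    volume per one percent fall in price.
    """
    volume = base_volume * (price / base_price) ** (-elasticity)
    return price * volume


# Hypothetical baseline: $10 per million tokens, 100 million tokens a month.
BASE_PRICE = 10.0
BASE_VOLUME = 100.0
NEW_PRICE = 1.0  # an assumed 10x price cut

# At the baseline price the elasticity has no effect on spend.
spend_before = monthly_spend(BASE_PRICE, BASE_PRICE, BASE_VOLUME, 1.5)

# Spend after the cut under three assumed elasticities.
results = {
    e: monthly_spend(NEW_PRICE, BASE_PRICE, BASE_VOLUME, e)
    for e in (0.5, 1.0, 1.5)
}

for elasticity, spend_after in results.items():
    print(f"elasticity {elasticity}: ${spend_before:,.0f} -> ${spend_after:,.0f}")
```

With these made-up inputs, the same 10x price cut shrinks the bill to about a third at elasticity 0.5, leaves it unchanged at 1.0, and roughly triples it at 1.5. Only the last case matches a world in which cost per user climbs while cost per token falls.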
“We lowered the price of it, but the consumption went up way way more than what you would have expected.” (Krishna Rao)
The supply side is doing its part to keep the rebound going. Harry Stebbings describes a compounding dynamic: raw chip performance improves roughly 3x every 18 months, and additional optimizations, including quantization and related techniques, add another 3x on top of that, producing something close to a 10x improvement in tokens per unit of money every couple of years. Dylan Patel points to DeepSeek as a specific data point inside that trend, having achieved GPT-4-level performance at 1/600th the cost of the original GPT-4. Jason Calacanis adds a structural observation: the model that sits at the frontier today will be 5 to 10x cheaper within a year, simply by virtue of no longer being the frontier model.
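The compounding Stebbings describes is simple multiplication, and a short sketch makes the "close to 10x" figure explicit. The two 3x factors and the 18-month period come from the paragraph above. Treating both gains as arriving on the same 18-month cadence, and compounding smoothly between periods, are assumptions made here for illustration, as is the helper name `tokens_per_dollar_gain`.

```python
HARDWARE_GAIN = 3.0  # chip performance gain per period (from the text)
SOFTWARE_GAIN = 3.0  # quantization and related optimizations (from the text)
PERIOD_MONTHS = 18   # assumed to apply to both gains


def tokens_per_dollar_gain(months):
    """Combined improvement factor after `months`, compounding smoothly."""
    per_period = HARDWARE_GAIN * SOFTWARE_GAIN
    return per_period ** (months / PERIOD_MONTHS)


per_period = tokens_per_dollar_gain(PERIOD_MONTHS)  # 3 * 3 = 9
per_year = tokens_per_dollar_gain(12)

print(f"per {PERIOD_MONTHS} months: {per_period:.1f}x")
print(f"per 12 months: {per_year:.1f}x")
```

Under these assumptions the combined gain is 9x per 18-month period, which is the "close to 10x" in the text, and a little over 4x when annualized. That annual figure sits just below the 5 to 10x yearly price decline Calacanis describes for a model that has left the frontier.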
The aggregate numbers that follow from this dynamic are large. Patel projects total economy-wide spending on top-tier AI models rising from roughly $40 billion currently to $100 billion by the end of 2026. Stebbings, citing Goldman Sachs forecasts, puts the longer arc at a 24x increase in token consumption by 2030 as agents come online. Gavin Baker ties the revenue side to a structural shift, arguing that the move to usage-based pricing is the primary reason OpenAI and Anthropic will exceed $200 billion in annual recurring revenue this year. That claim is forward-looking enough that it should be read as a projection rather than a result. But the direction it assumes is consistent with every behavioral data point in the current evidence set.
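To compare these projections on a common footing, it can help to convert each multiple into an implied yearly growth factor. The sketch below does only that arithmetic. The $40 billion, $100 billion, and 24x figures come from the paragraph above. The starting year for the 24x forecast is not stated in the text, so the horizons of four, five, and six years are assumptions, and `implied_annual_growth` is a hypothetical helper name.

```python
def implied_annual_growth(multiple, years):
    """Constant yearly growth factor g such that g ** years == multiple."""
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1.0 / years)


# From the text: roughly $40B today to $100B in top-tier model spending.
spend_multiple = 100 / 40  # 2.5x

# From the text: 24x token consumption by 2030. The base year is not
# given, so each horizon below is an assumption.
token_growth = {
    years: implied_annual_growth(24, years) for years in (4, 5, 6)
}

print(f"spending multiple: {spend_multiple:.1f}x")
for years, growth in token_growth.items():
    print(f"24x over {years} years -> about {growth:.2f}x per year")
```

Across those assumed horizons, a 24x rise works out to token consumption growing by somewhere around a factor of two each year. The exact rate depends on a base year the forecast does not supply here.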
The Uber parallel is worth holding onto, because it also carries a warning about how to read the numbers. Analysts who sized the ride-hailing opportunity by reference to the taxi and black car industries were not being lazy. They were measuring the market that existed. The Jevons mechanism makes that kind of baseline measurement structurally misleading whenever the cost or convenience of a good changes radically. The correct question is not what the current AI token market is worth. It is what new categories of use become economically rational at each successive price point, and how many of those categories exist.
Some of that expansion will reach households directly. Jesse Genet estimates that a family fully using AI across its members will spend somewhere between $2 and $500 per month on tokens. Genet also predicts that once household bills approach $400 per month, cost pressure alone, independent of any privacy concern, will push meaningful adoption toward local models. That is a secondary rebound effect: the cost of cloud consumption eventually creates demand for an entirely different class of infrastructure. The paradox, in other words, does not stop at the first rebound. It propagates.
Public reporting tracking the gap between falling per-token prices and rising total AI bills confirms the pattern is already playing out in practice, with engineering teams simultaneously reading about cost reductions and watching their monthly spend increase. The instinct to read efficiency gains as cost savings is understandable. The evidence says it is wrong.