Agentic AI crossed a real threshold in the past year, and the evidence is specific enough to believe

Several speakers now say, separately, that agentic and multi-agent AI systems were not viable a year ago and are viable today. That kind of agreement among independent voices deserves a closer look than the usual hype cycle gets.

By The Editor

The standard AI hype story runs like this: something ships, someone calls it transformative, the cycle resets. What is happening in agentic AI right now is different in one respect. The speakers making the strongest claims are also supplying the most falsifiable evidence, and several of them are naming the same narrow window of time.

Walden Yan is the most direct. Multi-agent systems, he says, were “very much not at all possible a year ago.” He does not leave that as a bare assertion. He pairs it with a data point: Devin’s commit percentage on its own repos climbed from 16% in January to 80% in March. That is not a trend line; it is a step function, and it covers a span of roughly ten weeks. Numbers that move that fast reflect either a genuine capability shift or a change in what is being measured. Yan treats them as the former.

Geoffrey Irving is working from a narrower window still. Models became good enough for full-on agentic tasks only as of a few weeks before his recording, he said. The week-scale framing is striking precisely because it is so exposed to being wrong. Speakers who want to seem prescient usually hedge. Irving did not.

Roman Chernin offers the most sober version of the claim. Across all the use cases that AI has been pointed at, coding is the one that actually works at scale, and even that only started working a few months ago, as of early 2026. That is a deliberately narrow claim. It does not say AI is everywhere. It says one thing works now that did not work recently, which is the kind of scoped statement that ages better than broad declarations.

“We have maybe one use case that works out of so many of use cases, and the one use case that works, like coding, everybody’s talking about coding, started working like maybe few months ago.”

Roman Chernin

Martin Casado and Cat Wu frame the same shift structurally. Casado places the inflection at the beginning of 2026, when agentic coding moved from “kind of useful” to something more consequential. Wu describes the generational divide in product terms: 2024 products were chat-based; the current generation is action-based. The distinction matters because chat and action have different failure modes, different latency tolerances, and different requirements for reliability. A product built around action cannot fake its way through a task the way a chatbot can redirect a conversation.

Greg Isenberg points to Andrej Karpathy’s public inversion in October and November 2025 as a leading indicator, the moment when Karpathy described flipping from mostly human-written code with AI augmentation to the opposite. Karpathy is the kind of witness whose self-report carries weight: he is precise, he is not trying to sell anything in that context, and the inversion he described is the sort of thing engineers notice when it happens to them personally. Isenberg is citing him as a bellwether, not a data source, which is the right use of that kind of testimony.

Dylan Patel adds the model-capability side of the picture. He characterizes what he calls “Mythos” as potentially the biggest step up in model capabilities in approximately two years. The “potentially” is doing real work there, and the product name comes from a transcript context that leaves some ambiguity, so it deserves to be treated as Patel’s characterization rather than confirmed release nomenclature. But the directional claim, that something large happened at the model level in a short window, fits the pattern the other speakers are describing.

What the accumulated testimony establishes, with enough confidence to be worth stating plainly: agentic AI went from marginal to functional within a period that most of these speakers locate between mid-2025 and early 2026. The evidence is still early, concentrated in coding, and mostly self-reported by people with proximity to the technology. None of that disqualifies it. It means the right response is to watch the next six months closely, not to discount what these voices said when they had no particular incentive to agree with each other.

❦

The Editor, for the readers of Signal Headquarters
