Occam's Inversion
Last week, the best AI system in the world could autonomously resolve about 81% of real-world software engineering tasks. This week, a new model hit 93.9%. Same evaluation, same problems, no change to the test. If you have been tracking AI progress as a smooth curve, that number should stop you. What is arriving now, across coding, reasoning, cybersecurity, and autonomous problem-solving, is a series of discontinuous jumps, each one resetting assumptions that were reasonable the month before.
Most people carry a mental model of AI progress that goes something like this: more data, bigger model, better results, diminishing returns, plateau. A recent paper from AI researchers Alessandro Achille and Stefano Soatto provides a formal framework for why that model no longer holds. Their argument is that we have crossed from systems that recognize patterns to systems that extract transferable reasoning structure from experience.
Consider a codebreaker. A pattern-matching approach studies thousands of past decryptions and memorizes which substitutions tend to appear. A structural approach is different: you learn that certain letters appear with predictable frequency in any language, that doubled letters constrain possibilities, that word boundaries create exploitable regularities. You extract principles that compress the search space for any cipher, including ones built on methods you have never encountered. The first approach improves linearly with more examples. The second compounds.
The "jagged" frontier of AI, where models are breathtakingly good at some tasks and oddly poor at adjacent ones, is the signature of this kind of learning: uneven but deep, closing gaps in bursts as structural insights land in specific domains. You cannot predict which gap closes next. You can only observe that the gaps are closing faster than anyone expected, and that each closure opens capabilities that were not on anyone's roadmap.
The underlying capability curve may be continuous at the level of training compute. But the experience of it is not. When a system extracts a structural insight that transfers across problems, the result is not gradual improvement. It is a phase change. Capabilities that were absent on Tuesday are present on Wednesday. A model that could not reliably plan a multi-step software fix last week can now resolve it autonomously. That is a step function, and the jump from 81% to 93.9% is what one looks like from the outside. The organizational implications follow from the experience, not the math.
The Achille and Soatto paper contains a counterintuitive inversion. Classical machine learning theory, rooted in Occam's Razor, holds that simpler models generalize better. For pattern matching, this is true. For reasoning, the relationship reverses: the more complex the domain, the more structural learning pays off. Simple problems offer little transferable structure to extract. Complex ones are rich with it.
The hardest, messiest, most irreducibly complex problems are precisely where AI reasoning is improving fastest. The ones organizations have labeled "too hard for AI" or sheltered behind as a reason for measured caution are the ones most exposed to rapid gains. The complexity is not a shield. It is an accelerant.
And this is where something has shifted that most organizations have not yet absorbed. For years, the binding constraint on AI's impact was technical capability. The models were not good enough, not reliable enough, not versatile enough. That constraint is loosening rapidly. The binding constraint now, for most organizations, is the capacity to adapt. AI progress is increasingly gated not by what the technology can do but by how quickly institutions can absorb what it already does.
Posture
If you are leading an organization through this, the question is not whether the step functions will continue. The question is what posture you hold when they arrive.
The first posture is ahead. These organizations built internal AI capability before the latest jump forced the question. They ran real workloads against real models, learned where the tools failed, and developed institutional judgment about what to trust and what to verify. Each step function extends their lead because they have already paid the cost of learning. They are not predicting the next jump. They are running experiments that will tell them what the next jump means for their business within days of its arrival, not months.
The second is behind. Each step function arrives as a shock, triggering a scramble to understand its implications and mount a response. By the time that response is mobilized, the ground has shifted again. Behind is not a position you choose. It is a position you drift into by treating AI as a project rather than a posture. It produces turbulence, zigging when you should be zagging, made worse by the fact that most organizations are structurally slower than the rate of change requires.
The third is structurally resistant, and it is not a harder version of being behind. It is a different problem entirely. Regulated industries, large enterprises with deep governance, public sector institutions: these organizations are not slow by accident. They are slow by design. Their regulatory frameworks, risk cultures, approval processes, and procurement cycles were built to prevent rapid change, because in the world they were designed for, rapid change was synonymous with risk. The organizations that are behind face a velocity problem. They need to move faster. The structurally resistant face a physics problem. The institutional forces actively oppose the motion they need. You cannot accelerate a system whose design function is to resist acceleration. You have to change what the system is optimizing for. And the longer that takes, the wider the gap grows between what the technology makes possible and what the institution can act on.
Here is where everything in this piece converges. Occam's inversion tells us that the most complex domains are the ones where AI reasoning will advance most dramatically. The structurally resistant organizations tend to operate in those exact domains: healthcare, financial services, legal, critical infrastructure, government. The institutions least equipped to absorb discontinuous change are the ones most exposed to it. And the very complexity they have used to justify caution is the reason the step functions heading their way will be among the largest.
Were we wrong or just early? Yes
Some will ask whether they got it wrong. Whether the strategies and investments set in the last cycle were mistakes. Yes and no. Some of those bets are already stale, because nobody can predict what a step function looks like before it arrives. But the bets that hold are the ones that were never about a specific capability in the first place: building the muscle to evaluate new tools fast, keeping decision loops short, developing people who can distinguish signal from noise at the frontier. A strategy built on proximity to the frontier holds across discontinuities. A strategy built on a specific snapshot of what AI can do has a shelf life measured in months.
The instinct in large organizations facing this uncertainty is to standardize: pick an approach, codify it, enforce it, call it scale. In a world of continuous change, that works. In a world of step functions, it is scaling the wrong thing. The pilot that worked last quarter may be irrelevant after the next jump. The vendor you standardized on may be leapfrogged by a model that did not exist when you signed the contract. Scaling a solution across discontinuities produces organizations that have AI everywhere and insight nowhere. The alternative is to scale the capacity for judgment instead: the ability to evaluate, adopt, and abandon tools at the speed the technology demands.
The organizations that navigate this well will not be the ones that predicted the right curve. They will be the ones that built for a world where the ground shifts without warning, repeatedly, and each shift carries more consequence than the last. In a step-function world, the advantage belongs to those who can stand on new ground the fastest.