The Einstein Test: Why AGI Is Not Around the Corner
President, Zaruko
The AI industry has a prediction problem.
Every few months, another CEO or billionaire declares that Artificial General Intelligence is imminent. Elon Musk says AGI will arrive by the end of 2026. Dario Amodei, CEO of Anthropic, predicts "a country of geniuses in a datacenter" by 2027.[3] Geoffrey Hinton warns that AI capabilities are accelerating faster than anyone expected. Sam Altman told the world OpenAI already knows how to build AGI and is looking past it to superintelligence.
I disagree. And I think the evidence supports my position.
Current AI systems, including the most advanced large language models, are not on a path to AGI. They are extraordinary tools. They are not general intelligence. Getting from here to there will require new technologies, new architectures, and scientific breakthroughs that we have not yet made. World models may eventually contribute to the solution, but even that path stretches decades into the future, if it gets there at all.
I could be wrong. But here is why I believe what I believe today.
Demis Hassabis Just Gave Us the Right Test
Among all the noise, Demis Hassabis, CEO of Google DeepMind and 2024 Nobel laureate in Chemistry, recently offered the clearest framing I have seen for how to think about AGI.
Speaking in an interview with Varun Mayya at the Indian Institute of Science (IISc) in Bangalore on February 17, 2026, as part of the India AI Impact Summit, Hassabis proposed what is now being called the "Einstein Test."[1]
The idea is simple and brilliant: Train an AI system on all human knowledge, but cut off the data at 1911. Then see if the system can independently discover general relativity, as Einstein did in 1915.
If it can, we have AGI. If it cannot, we have a very sophisticated pattern matcher.
Hassabis was direct about where current systems stand: "It's clear today's systems couldn't do that."
This framing matters because it separates two things that the AI industry constantly conflates: performing known tasks well and generating genuinely new knowledge. Every benchmark achievement, every gold medal on math olympiad questions, every impressive coding demo falls into the first category. AGI requires the second.
Put differently: LLMs interpolate. They find patterns within their training data and recombine them. AGI would need to extrapolate, to reach conclusions that do not exist anywhere in the data. I wrote about this distinction in AI Can Write. It Cannot Think. — fluency is not understanding, and pattern matching is not reasoning.
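To make the distinction concrete, here is a minimal sketch in Python, assuming nothing about any particular model. The function and numbers are invented for illustration: fit a simple curve to data from a limited range, then query it inside and outside that range.

```python
import numpy as np

# Toy model of interpolation vs. extrapolation: fit a polynomial
# (a pure pattern matcher) to noisy samples of sin(x) on [0, 2*pi].
rng = np.random.default_rng(0)
x_train = np.linspace(0, 2 * np.pi, 50)
y_train = np.sin(x_train) + rng.normal(0, 0.05, x_train.size)

coeffs = np.polyfit(x_train, y_train, deg=9)

inside = 3.0           # within the training range
outside = 4 * np.pi    # far beyond anything seen in training

print(f"sin({inside:.2f}) = {np.sin(inside):+.3f}, "
      f"model: {np.polyval(coeffs, inside):+.3f}")   # close match
print(f"sin({outside:.2f}) = {np.sin(outside):+.3f}, "
      f"model: {np.polyval(coeffs, outside):+.3f}")  # wildly wrong
```

Inside the training range, the fit looks brilliant. A few periods outside it, the answers are nonsense. LLMs are incomparably more sophisticated than a polynomial fit, but on this argument the failure is the same in kind: the patterns hold only where the data lives.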
The Jagged Intelligence Problem
Hassabis also identified a problem that anyone who uses these systems extensively will recognize. Speaking at the India AI Impact Summit main stage in New Delhi on February 18, he described current AI as having "jagged intelligence," capable of winning gold medals at the International Math Olympiad one moment and stumbling on relatively simple math problems the next, depending on how the question is framed.[2]
As Hassabis put it: "That shouldn't happen with a true general intelligence. It shouldn't be a jagged intelligence like that."
I have seen this firsthand. I have built and deployed ML systems. I have watched models produce brilliant output on Monday and fail on the same class of problem Tuesday because the input was phrased slightly differently. In production, this inconsistency is not a minor annoyance. It is the difference between a tool you can trust and one you cannot.
This inconsistency reflects how these systems work. LLMs do not understand math. They have learned statistical patterns about how mathematical text tends to look. When a problem fits a familiar pattern, they solve it brilliantly. When it does not, they fail in ways that would embarrass a first-year student.
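As a deliberately crude analogy, not a claim about how transformers work internally, imagine a "solver" that only recognizes one surface form of a question. Everything here is invented for illustration:

```python
import re

# A toy "solver" that matches a single surface pattern.
# This is an analogy for brittleness, not a model of an LLM.
def answer(question: str):
    m = re.fullmatch(r"What is (\d+) \+ (\d+)\?", question)
    if m:
        return int(m.group(1)) + int(m.group(2))
    return "no idea"

print(answer("What is 17 + 25?"))     # 42
print(answer("Add 17 and 25."))       # no idea -- same task, new phrasing
print(answer("What is 17 plus 25?"))  # no idea
```

The task never changed; only the phrasing did. Real models degrade far more gracefully than this toy, but sensitivity to surface form is the same species of problem.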
What Is Actually Missing
Hassabis has spent three decades working on AI and holds a PhD in neuroscience. His definition of AGI has remained consistent. At the IISc interview, he stated it plainly: "My definition of AGI has never changed. A system that can exhibit all the cognitive capabilities humans can."
He pointed to the human brain as "the only existence proof we have, maybe in the universe, of a general intelligence." That is why he studied neuroscience, to understand the only data point we have that general intelligence is even possible.
At both the IISc interview and the Summit main stage, Hassabis identified specific capabilities that current systems lack:
True creativity. Not generating variations of existing content, but producing something genuinely new. Not playing Go at a world champion level, but inventing a game as elegant as Go. Not solving a physics problem, but formulating an entirely new theory of physics. As Hassabis noted: "True creativity, continual learning, long-term planning. They're not good at those things."
Continual learning. Current systems are frozen in time after training. They cannot learn from experience in real time, adapt to new contexts, or improve through interaction with the world. As Hassabis explained: "What you'd like is for those systems to continually learn online from experience, to learn from the context they're in."
Long-term planning. LLMs can plan over short horizons, but they cannot sustain coherent planning over weeks, months, or years. Hassabis noted: "They can plan over the short term, but over the long term, the way that we can plan over years, they don't really have that capability at the moment."
These are not minor gaps. They are not features you patch in with a version upgrade. They require new science.
Beyond missing capabilities, there are structural problems with how these systems are built:
No understanding of the physical world. LLMs train on text. They can describe gravity, but they do not understand it. They can write about what happens when you push a glass off a table, but only because the words "glass," "fall," and "break" appear together frequently in their training data. They have no internal model of physics, space, or how objects behave. As Yann LeCun has argued, a house cat has a more accurate understanding of the physical world than the most advanced LLM.
No world models or knowledge representation. General intelligence requires an internal model of how reality works, one that allows a system to reason about cause and effect, predict the consequences of actions, and simulate outcomes before acting. Current LLMs have no such structure. They process tokens. They do not represent knowledge in a way that supports reasoning about the world.
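What would it even mean to have one? A minimal sketch, with a world invented purely for illustration: an explicit transition model the agent can query to simulate outcomes before acting.

```python
# Hypothetical one-dimensional world: positions 0 through 5,
# with a cliff at position 5. A world model, at minimum, is an
# explicit transition function the agent can consult.
def predict(state: int, action: str) -> tuple[str, int]:
    """Predict the consequence of an action without taking it."""
    x = state + {"left": -1, "right": +1}[action]
    return ("fell off the cliff", x) if x >= 5 else ("safe", x)

# Simulate both outcomes before acting, then choose.
state = 4
for action in ("left", "right"):
    print(f"{action}: {predict(state, action)}")
# left: ('safe', 3)
# right: ('fell off the cliff', 5)
```

The point is not that such a model is hard to write down for a toy world. It is that current LLMs do not build or consult anything like it for the real one.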
No causal reasoning. Current systems find correlations. They identify that certain patterns tend to appear together. But they do not understand why things happen. A system that notices ice cream sales and drowning deaths rise together has found a correlation. A system with causal reasoning understands that summer drives both. That distinction is the difference between statistical pattern matching and actual understanding.
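The ice cream example is easy to simulate. Here is a sketch with entirely made-up numbers, in which summer temperature drives both series and neither causes the other:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: heat drives both ice cream sales and swimming
# (and therefore drownings). Neither series causes the other.
temperature = rng.normal(20, 8, 10_000)                    # confounder
ice_cream = 2.0 * temperature + rng.normal(0, 5, 10_000)
drownings = 0.5 * temperature + rng.normal(0, 5, 10_000)

# The raw correlation looks alarming...
print(np.corrcoef(ice_cream, drownings)[0, 1])             # ~0.6

# ...but vanishes once the confounder is held fixed: regress
# temperature out of both series and correlate the residuals.
def residualize(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

print(np.corrcoef(residualize(ice_cream, temperature),
                  residualize(drownings, temperature))[0, 1])  # ~0.0
```

A pattern matcher sees the first number. Causal reasoning is knowing to compute the second.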
No transfer of learning across domains. Humans learn physics and apply that understanding to cooking, sports, and engineering. We learn about trust in one relationship and carry that knowledge into the next. Current AI systems are siloed. A model that excels at chess knows nothing about strategy in business. A model trained on medical data cannot apply biological reasoning to chemistry without separate training. General intelligence requires the ability to move knowledge between domains the way humans do naturally.
The AGI Timeline Spectrum
The AI community is split on timelines, and the positions are revealing.
On one end, you have the "AGI is imminent" camp. Musk says 2026. Amodei says 2026 or 2027. Altman has been saying "any day now" for years. When timelines align with fundraising cycles, pay attention. These predictions tend to come from people who run companies that need the AGI narrative to justify billion-dollar infrastructure investments and sky-high valuations. That does not make them wrong. But extraordinary claims deserve extraordinary incentives analysis.
On the other end, you have researchers who are more measured. Yann LeCun, the Turing Award winner who recently left Meta to start AMI Labs, has been the most forceful voice here.[4] LeCun has called LLMs a "dead end" for achieving AGI. His argument is that text-based systems have no model of the physical world, and without that, they cannot reason about cause and effect, predict consequences of actions, or understand reality in the way that even a house cat does.
LeCun's position aligns with mine: LLMs are useful. They are not a path to general intelligence. Getting there requires entirely new approaches. World models, which learn representations of physical reality through observation rather than text, are a promising direction. But the science is early. LeCun himself has said the breakthroughs needed will take years to a decade or more.[5]
Hassabis sits somewhere in the middle. He has said there is roughly a 50% chance of AGI by the end of the decade, which makes him more optimistic than LeCun but far more conservative than Musk or Amodei. Even that estimate is conditioned on specific research breakthroughs that have not happened yet. And at the India AI Impact Summit, Hassabis said plainly: "I think we're still a few years away from that."
The strongest counterargument to my position is that scale plus better architectures, perhaps including world models, might eventually approximate these capabilities. That is possible. But "possible" is not "imminent," and "approximate" is not "achieve." The gap between where we are and where AGI requires us to be is measured in scientific breakthroughs, not product releases.
Why This Matters for Business Leaders
If you run a company, the practical question is: how should you think about AI investment given this uncertainty?
Deploy AI aggressively for what it does well today. Current AI systems are powerful tools for automation, analysis, content generation, coding assistance, and decision support. The ROI on these applications is real and measurable right now. If you need a framework for where to start, the simple rule is: automate the repeatable, keep humans on the judgment calls.
Do not bet your strategy on AGI arriving soon. If your AI roadmap depends on systems that can think like humans, you are building on sand. Plan for the tools that exist, not the ones that are promised.
Watch for genuine breakthroughs, not benchmark improvements. When someone claims an AI advance, ask yourself: does this system demonstrate true creativity, continual learning, or long-term planning? Or is it just better at pattern matching on familiar tasks? The Hassabis Einstein Test is a useful mental model for separating real progress from incremental improvement.
Be skeptical of timelines from people selling AI products. If AGI were truly two years away, the companies building it would not need to tell you. The technology would speak for itself.
The Bottom Line
We do not have AGI. We have powerful tools. Confusing the two is expensive.
Demis Hassabis gave us the clearest test for AGI I have seen. Train an AI on everything we knew by 1911. See if it can independently derive general relativity. Today's systems cannot come close.
Use AI for what it does well. Build your strategy on reality. And when someone tells you AGI is around the corner, ask them this:
Could your system have discovered general relativity?
I know the answer. So do they.
Sources
1. Demis Hassabis interview with Varun Mayya at IISc Bangalore, February 17, 2026. Hassabis proposed the "Einstein Test" for AGI and stated "it's clear today's systems couldn't do that." YouTube video | Full transcript
2. Demis Hassabis remarks at the India AI Impact Summit, New Delhi, February 18, 2026. Described current AI as having "jagged intelligence" and predicted AGI is "still a few years away." News9 coverage | Business Today coverage
3. Demis Hassabis and Dario Amodei conversation at Davos 2026. Amodei predicted "a country of geniuses in a datacenter" by 2027; Hassabis estimated roughly a 50% chance of AGI by the end of the decade. Transcript
4. Yann LeCun interview, MIT Technology Review, January 2026. LeCun left Meta to start AMI Labs, calling LLMs a "dead end" for achieving AGI. AMI Labs profile
5. Yann LeCun at the India AI Impact Summit 2026. Argued that AGI is overhyped and that the breakthroughs needed for general intelligence will take years to a decade or more. Capacity coverage
Continue Reading
First Principles of AI
Ten foundational principles for evaluating AI claims and making better decisions.
AI Can Write. It Cannot Think.
Writing quality and factual accuracy are independent. LLMs optimize for the first.
The Simple Rule that Separates AI Winners from Everyone Else
Automate the repeatable. Keep humans on the judgment calls.
Navigating AI investment decisions?
I help mid-market companies separate AI reality from AI hype — and build strategies based on what the technology actually does today. Let's talk.