Tags: AGI · Google DeepMind · AI Research · Machine Learning · AI Safety

Google DeepMind's New Framework for Measuring Progress Toward AGI: Everything You Need to Know

Google DeepMind has released a groundbreaking cognitive framework for measuring progress toward Artificial General Intelligence. Here's what it reveals about how close we really are to AGI.

Md. Rakib · March 30, 2026 · 7 min read


One of the most debated questions in artificial intelligence is deceptively simple: How close are we to AGI? Google DeepMind just published a groundbreaking paper that attempts to answer this question with a structured, scientific framework for measuring progress toward Artificial General Intelligence.

This isn't just another think piece or prediction - it's a rigorous cognitive framework that could fundamentally change how the entire AI industry benchmarks progress. Let's dive deep into what they've proposed and what it means.

What Is AGI, Really?

Before we can measure progress toward AGI, we need to define it. And that's been one of the biggest challenges in AI research. Different organizations have different definitions:

  • OpenAI: "Highly autonomous systems that outperform humans at most economically valuable work"
  • Google DeepMind: A system with broad cognitive capabilities matching or exceeding human performance
  • Academic consensus: A machine that can understand, learn, and apply intelligence across any domain

DeepMind's new framework moves beyond these vague definitions to create measurable, testable criteria.

The DeepMind AGI Framework: Key Components

1. Levels of AGI

Rather than treating AGI as a binary (either we have it or we don't), DeepMind proposes a 6-level scale:

  • Level 0 (No AI): Rule-based systems only. Current example: basic calculators
  • Level 1 (Emerging): Equal to or somewhat better than an unskilled human. Current example: GPT-3, early LLMs
  • Level 2 (Competent): At least 50th percentile of skilled adults. Current example: GPT-4, Claude Opus, Gemini 3
  • Level 3 (Expert): At least 90th percentile of skilled adults. Narrow examples: AlphaFold, coding agents
  • Level 4 (Virtuoso): At least 99th percentile of skilled adults. Not yet achieved broadly
  • Level 5 (Superhuman): Outperforms all humans. Narrow examples: chess and Go engines
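To make the thresholds concrete, here is a minimal Python sketch that maps a performance percentile (relative to skilled adults) to a level. The `AGILevel` names follow the scale above, but the exact cutoffs and the Level 1 handling are my own illustrative reading, not DeepMind's code.

```python
from enum import IntEnum

class AGILevel(IntEnum):
    """The six levels from DeepMind's proposed scale (illustrative)."""
    NO_AI = 0
    EMERGING = 1
    COMPETENT = 2
    EXPERT = 3
    VIRTUOSO = 4
    SUPERHUMAN = 5

def level_from_percentile(percentile: float) -> AGILevel:
    """Map a performance percentile (vs. skilled adults) to a level.

    Cutoffs follow the table above; treating any nonzero score as
    'Emerging' is a simplification for illustration.
    """
    if percentile >= 100:   # exceeds all humans
        return AGILevel.SUPERHUMAN
    if percentile >= 99:
        return AGILevel.VIRTUOSO
    if percentile >= 90:
        return AGILevel.EXPERT
    if percentile >= 50:
        return AGILevel.COMPETENT
    if percentile > 0:
        return AGILevel.EMERGING
    return AGILevel.NO_AI

print(level_from_percentile(92).name)  # EXPERT
```

The point of the `IntEnum` is that levels are ordered, so you can meaningfully ask whether one system's level exceeds another's.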

2. Breadth vs. Depth Matrix

One of the framework's most important innovations is separating breadth (how many domains) from depth (how well in each domain):

  • Narrow AI: Expert or superhuman in ONE domain (we have this)
  • General AI: Competent or better across MANY domains (this is the goal)
  • Broad AI: Emerging capabilities across many domains (this is roughly where we are)

Current frontier models like Claude, GPT-5, and Gemini 3 fall into what DeepMind calls "Level 2 General" - competent across many tasks but not yet expert-level across the board.
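The breadth-vs-depth distinction can be sketched as a toy classifier over per-domain level scores (using the 0-5 scale above). The domain names, scores, and thresholds here are my own invented placeholders, not figures from the paper.

```python
# Hypothetical per-domain level scores (0-5) for an imaginary system.
domain_levels = {
    "coding": 3, "writing": 2, "math": 2,
    "vision": 2, "planning": 1, "robotics": 1,
}

def classify_breadth(levels: dict[str, int]) -> str:
    """Return a coarse breadth label from per-domain level scores (0-5)."""
    scores = list(levels.values())
    if all(s >= 2 for s in scores):
        return "General"       # competent or better across all domains
    if all(s >= 1 for s in scores):
        return "Broad"         # emerging everywhere, not yet competent everywhere
    if any(s >= 3 for s in scores):
        return "Narrow"        # expert+ in at least one domain only
    return "Pre-general"

print(classify_breadth(domain_levels))  # Broad
```

Note how a single weak domain (here, planning and robotics at Level 1) pulls the whole system from "General" down to "Broad", which mirrors the article's claim that broad emerging capability is roughly where we are.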

3. Cognitive Capabilities Tested

The framework evaluates AI across these cognitive dimensions:

Reasoning and Problem-Solving

  • Abstract reasoning
  • Causal inference
  • Multi-step logical deduction
  • Novel problem-solving (not pattern matching)

Learning and Adaptation

  • Few-shot learning efficiency
  • Transfer learning across domains
  • Continuous learning without catastrophic forgetting
  • Learning from feedback

Language and Communication

  • Natural language understanding
  • Nuanced communication
  • Multilingual capability
  • Context-appropriate responses

Perception and Interaction

  • Multimodal understanding (text, image, audio, video)
  • Spatial reasoning
  • Temporal understanding
  • Physical world modeling

Social Intelligence

  • Theory of mind
  • Emotional understanding
  • Cultural awareness
  • Collaborative problem-solving

Metacognition

  • Self-awareness of limitations
  • Uncertainty quantification
  • Strategic planning
  • Resource allocation
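One practical way to use these dimensions is as a scoring rubric: record a score per dimension, then surface the weakest ones as research priorities. The scores below are invented placeholders purely for illustration.

```python
def weakest_dimensions(scores: dict[str, float], n: int = 2) -> list[str]:
    """Return the n lowest-scoring dimensions, i.e. the biggest capability gaps."""
    return sorted(scores, key=scores.get)[:n]

# Made-up scores on the six dimensions named above (not measurements).
example_scores = {
    "reasoning": 2.4, "learning": 2.1, "language": 3.0,
    "perception": 2.2, "social": 2.0, "metacognition": 1.3,
}

print(weakest_dimensions(example_scores))  # ['metacognition', 'social']
```

Even a crude rubric like this makes the framework actionable: instead of one aggregate "AGI score", you get a profile that points at specific gaps.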

Where Are We Now?

Based on DeepMind's framework, here's an honest assessment of where current AI stands:

What We've Achieved (Level 2-3)

  • Language understanding: Frontier models comprehend and generate text at expert human levels
  • Code generation: AI can write, debug, and explain code at a professional level
  • Knowledge synthesis: Models can combine information across vast domains
  • Creative tasks: AI produces compelling writing, art, and music

Where We're Still Struggling (Level 1-2)

  • Physical reasoning: Understanding how the real world works remains challenging
  • Long-term planning: Multi-step plans over extended periods still fail frequently
  • Novel problem-solving: AI excels at pattern matching but struggles with truly new problems
  • Common sense: Everyday reasoning that humans find trivial can trip up AI
  • Reliable agency: AI agents still make critical errors in autonomous operation

What's Still Far Away (Level 0-1)

  • True understanding: AI processes information but may not truly "understand" it
  • Consciousness: If it's even relevant to AGI (debatable)
  • Robust generalization: Performing well in genuinely novel situations

Why This Framework Matters

1. It Ends the Hype vs. Doom Debate

Instead of arguing about whether AGI is "2 years away" or "50 years away," we can now have nuanced discussions about specific capabilities and levels. We might achieve Level 3 General AI in 5 years but Level 5 might take 30 years - and that distinction matters enormously.

2. It Guides Research Priorities

By identifying exactly where current AI falls short, researchers can focus their efforts. The framework reveals that metacognition and novel problem-solving are the biggest gaps - these should be priority research areas.

3. It Helps with Safety Planning

Different levels of AGI require different safety measures:

  • Level 2-3: Current alignment techniques may be sufficient
  • Level 4: We need significant advances in interpretability and control
  • Level 5: This is where existential risk discussions become critical

4. It Sets Industry Standards

Having a common measurement framework allows:

  • Meaningful comparison between different AI systems
  • Clear communication about capabilities to the public
  • Informed policy and regulation decisions
  • Better investment allocation in AI research

The Controversy

Not everyone agrees with DeepMind's approach. Critics raise several points:

"You Can't Measure AGI Like This"

Some researchers argue that intelligence is too complex to fit into a neat scale. They worry that optimizing for specific benchmarks might create the illusion of progress without genuine advancement.

"It's Self-Serving"

Skeptics note that frameworks created by AI companies tend to place their own products favorably. DeepMind's Gemini models naturally score well on their own evaluation criteria.

"It Ignores Consciousness"

Philosophers and some researchers argue that true AGI must involve some form of consciousness or understanding, not just functional equivalence. DeepMind's framework deliberately sidesteps this question.

"The Goalposts Will Move"

History shows that as AI achieves milestones, we tend to redefine AGI to exclude them. "Playing chess" was once considered a sign of intelligence; now it's "just computation."

What This Means for the Future

For AI Researchers

This framework provides clear targets. The next frontier is Level 3 General - expert-level performance across diverse cognitive tasks. Key areas to crack:

  1. Robust reasoning that works on novel problems
  2. Reliable long-term planning and execution
  3. True understanding vs. sophisticated pattern matching
  4. Self-aware systems that know what they don't know

For the Industry

Companies can now:

  • Set realistic product roadmaps based on capability levels
  • Communicate more honestly about what their AI can and can't do
  • Plan safety measures appropriate to their system's level
  • Make better hiring and research investment decisions

For Society

The public now has a clearer way to understand:

  • What AI can actually do today (it's impressive but has clear limits)
  • What's coming next (more capable but not omniscient AI)
  • When to be concerned (Level 4+ requires serious safety work)
  • How to plan (education, careers, policy)

My Perspective

Having followed AI research closely, I believe DeepMind's framework is the most useful attempt yet at measuring AGI progress. It's not perfect - no single framework can capture the full complexity of intelligence - but it gives us a shared language and measurement system that the field desperately needs.

We're currently at Level 2 General AI with pockets of Level 3 in specific domains. The jump to Level 3 General is likely achievable within the next few years. But Level 4 and 5 represent fundamentally harder challenges that may require breakthroughs we haven't yet imagined.

The most important takeaway? AGI isn't a single moment or event. It's a gradual progression, and we're already further along than most people realize - while also further away from "true" AGI than the hype suggests.


How do you think we should measure progress toward AGI? Do you find DeepMind's framework useful, or do you think we're missing something fundamental? Share your thoughts below.
