Progress tracker
AGI Progress
Where are we on the path to artificial general intelligence? Every major lab, researcher, and framework defines it differently. This page tracks them all, maps capabilities, and charts the milestones.
Capability Status
Autonomous Task Execution
Emerging (agency)
Best: Claude Code / Devin
Tool Use / Function Calling
Achieved (agency)
Best: Claude / GPT-4
Multi-step Reasoning
Achieved (cognition)
Best: o3 / DeepSeek-R1
Natural Language Understanding
Achieved (cognition)
Best: GPT-4o / Claude Opus 4.6
World Models / Physical Understanding
Not Yet (cognition)
Genuinely Original Creative Work
Disputed (creativity)
Autonomous Scientific Research
Emerging (innovation)
Best: FutureHouse / various
Long Context (1M+ tokens)
Achieved (memory)
Best: Gemini 2.5 Pro
Persistent Cross-Session Memory
Partial (memory)
Best: ChatGPT memory / Claude memory
Self-Correction / Reflection
Emerging (metacognition)
Best: o1 / Claude extended thinking
Vision + Audio + Text
Achieved (perception)
Best: GPT-4o / Gemini
Code Generation
Achieved (technical)
Best: Claude Code / Codex
How the Labs Define AGI
Anthropic
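AI Safety Levels (ASL)
Safety-focused framework tied to deployment policy. Anthropic activated ASL-3 safeguards with Claude Opus 4 in May 2025; earlier models were deployed under ASL-2. ASL-3 requires stronger security and deployment mitigations before release, aimed mainly at chemical, biological, radiological, and nuclear (CBRN) misuse.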
ASL-1: No meaningful catastrophic risk
ASL-2: Present-day risks, requires current safeguards
ASL-3: Substantially increased risk, enhanced containment needed
ASL-4: Potentially catastrophic autonomous capabilities
Google DeepMind
Levels of AGI
Six performance levels (Level 0 to Level 5) crossed with breadth (narrow vs. general). Published Nov 2023. Defines AGI as general-purpose AI at Level 3+ (Expert).
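Level 0 (No AI): Narrow non-AI tools
Level 1 (Emerging): Equal to or somewhat better than unskilled human
Level 2 (Competent): At least 50th percentile of skilled adults
Level 3 (Expert): At least 90th percentile of skilled adults
Level 4 (Virtuoso): At least 99th percentile of skilled adults
Level 5 (Superhuman): Outperforms 100% of humans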
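The percentile thresholds above amount to a simple lookup. The following is a minimal sketch of that mapping, not code from DeepMind: the function name `performance_level` and the `THRESHOLDS` table are hypothetical, `None` is used here to stand for a non-AI tool (Level 0), and any AI system below the 50th percentile is assumed to fall into Level 1. It covers only the performance axis and ignores the narrow vs. general breadth axis.

```python
from typing import Optional

# (minimum percentile of skilled adults, level), checked from highest to lowest.
# The values mirror the thresholds listed above; the table itself is illustrative.
THRESHOLDS = [
    (100.0, 5),  # outperforms 100% of humans
    (99.0, 4),   # at least 99th percentile
    (90.0, 3),   # at least 90th percentile
    (50.0, 2),   # at least 50th percentile
]


def performance_level(percentile: Optional[float]) -> int:
    """Map a percentile among skilled adults to a performance level (0-5).

    Assumption: None means the tool is not AI at all (Level 0), and any
    AI system scoring below the 50th percentile counts as Level 1.
    """
    if percentile is None:
        return 0
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must be between 0 and 100")
    for minimum, level in THRESHOLDS:
        if percentile >= minimum:
            return level
    return 1


if __name__ == "__main__":
    for p in (None, 10.0, 50.0, 90.0, 99.0, 100.0):
        print(p, "->", performance_level(p))
```

Under this sketch, a system at the 90th percentile on a task lands at Level 3, the threshold this page cites for AGI when combined with general breadth.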
OpenAI
Five Levels of AI
Internal framework leaked in July 2024. OpenAI claimed Level 2 reached with o1. Level 3 (agents) is the current frontier as of early 2026.
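Level 1 (Chatbots): Conversational AI with natural language
Level 2 (Reasoners): Human-level problem solving
Level 3 (Agents): Systems that can take actions in the world
Level 4 (Innovators): AI that aids in scientific invention
Level 5 (Organizations): AI that can do the work of an entire organisation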
Researcher Positions
Gary Marcus
Sceptical Position
Argues AGI requires fundamental breakthroughs in reasoning, reliability, and genuine understanding. Current models are sophisticated pattern matchers that fail on novel situations. Consistent critic of AGI hype.
Mustafa Suleyman
Modern Turing Test
Proposed a practical economic test in place of conversational imitation: give an AI $100,000 and see whether it can turn that into $1 million. Focuses on real-world capability and autonomous decision-making.
Shane Legg (DeepMind co-founder)
Original AGI Definition
Defined AGI as "a machine that can do any intellectual task that a human being can." Co-coined the term "Artificial General Intelligence." Has predicted 50% chance of AGI by 2028.
Yann LeCun
World Models Required
Argues current LLMs cannot achieve AGI because they lack world models, persistent memory, and genuine planning. Token prediction is insufficient. Cofounded AMI Labs in 2026 with $1B+ to pursue his approach.
Our Position
Practical AGI Achieved
Position: AGI was functionally achieved with ChatGPT in November 2022. Since then we have been climbing capability levels. The question is no longer "if AGI" but "what level of AGI and how fast."
ChatGPT launch — broad general capability across knowledge domains
o1/o3, DeepSeek-R1 — chain of thought, multi-step problem solving
Claude Code, Codex — autonomous task completion with tool use
Multi-agent systems coordinating work across domains
Self-directed systems that can identify and pursue goals independently
Milestones
Jan 2026
Multi-Agent Systems Mature
Claude Code, Codex, and Gemini working together on shared codebases. Agent coordination becomes practical, not theoretical.
Aug 2025
GPT-5 Release
Significant capability jump across all domains. Pushes frontier of what single-model systems can achieve.
May 2025
Claude Opus 4
Extended thinking, agentic capabilities, sustained complex task execution. Powers multi-session autonomous work.
Feb 2025
Claude Code Launch
Anthropic launches agentic coding tool. Autonomous file editing, terminal access, git operations. The agentic era begins.
Jan 2025
DeepSeek-R1
Open-weight reasoning model from China matching o1 performance at a fraction of the cost. Democratises advanced reasoning.
Sept 2024
o1 Release
OpenAI releases o1 with explicit chain-of-thought reasoning. Claims Level 2 (Reasoners) reached. PhD-level science performance.
May 2024
GPT-4o (Omni)
Omni model with native audio/video understanding. Sub-second voice responses. Free tier access.
Mar 2024
Claude 3 Opus
Anthropic releases Claude 3 family. Opus leads multiple benchmarks. First model widely considered to match GPT-4.
Feb 2024
Sora Demonstration
OpenAI demonstrates photorealistic text-to-video generation. Later shut down in 2026 due to cost and moderation challenges.
Dec 2023
Gemini Release
Google releases Gemini, natively multimodal from the ground up.
Jul 2023
Claude 2
Anthropic releases Claude 2 with 100K context window — 10x what was standard at the time.
Mar 2023
GPT-4 Release
Multimodal model passing bar exam, SAT, and various professional benchmarks. Significant quality jump over GPT-3.5.
Nov 2022
ChatGPT Launch
OpenAI releases ChatGPT. Broad conversational AI available to the public for the first time. Reaches 100M users in 2 months.