Ranking desk
AI Leaderboard Desk
This page separates three different questions. The frontier lane answers what is current. Reliability Floor asks whether everyday basics are covered. The evaluated composite ranks models within the benchmark-backed scored set.
Important
The old “composite leaderboard” confusion came from treating one scored table as the answer to everything. The frontier lane keeps newer launches visible immediately, while the evaluated composite only ranks models once there is enough public evidence to score them with some confidence.
| Model | State | Price |
|---|---|---|
| GPT-5.4 OpenAI / 1.1M context | tracking tracking | $2.50 / $15.00 |
| Claude Opus 4.6 Anthropic / 1.0M context | scored active | $15.00 / $75.00 |
| Gemini 3.1 Pro Google / 1.0M context | partial tracking | $2.00 / $12.00 |
| Claude Sonnet 4.6 Anthropic / 1.0M context | scored active | $3.00 / $15.00 |
| Grok 4.20 xAI / 2.0M context | tracking tracking | $1.25 / $2.50 |
| Qwen 3.6 Plus Alibaba / 1.0M context | tracking tracking | $0.33 / $1.95 |
| MiniMax M2.7 MiniMax / 205K context | tracking tracking | $0.28 / $1.20 |
| GLM-5 Zhipu AI / 203K context | tracking tracking | $0.60 / $1.92 |
| Kimi K2.5 Moonshot AI / 131K context | tracking tracking | $0.57 / $2.30 |
| Gemma 4 Google / 262K context | tracking tracking Open | $0.12 / $0.36 |
| DeepSeek R1 DeepSeek / 164K context | scored active Open | $0.70 / $2.50 |
| DeepSeek V3.2 DeepSeek / 164K context | scored active Open | $0.20 / $0.77 |
| Nemotron 3 Ultra NVIDIA / 1.0M context | tracking tracking | $0.50 / $2.50 |
| Qwen3.7 Plus Alibaba / 1.0M context | tracking tracking | $0.40 / $1.60 |
| MiniMax M3 MiniMax / 1.0M context | tracking tracking | $0.30 / $1.20 |
| Claude Opus 4.8 Anthropic / 1.0M context | tracking tracking | $5.00 / $25.00 |
| Claude Opus 4.8 (Fast) Anthropic / 1.0M context | tracking tracking | $10.00 / $50.00 |
| Qwen3.7 Max Alibaba / 1.0M context | tracking tracking | $1.25 / $3.75 |
The frontier lane is intentionally not a synthetic score. It keeps the current flagship and launch watchlist visible even when benchmark coverage is still catching up.
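The split between the two lanes can be sketched in a few lines. This is an illustration only, not the site's implementation: the `Model` fields, the `benchmark_count` evidence measure, the `min_benchmarks` cutoff, and the placeholder model names are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    benchmark_count: int  # hypothetical: number of public benchmark results available


def frontier_lane(models):
    """List every current model, whether or not benchmark evidence exists yet."""
    return [m.name for m in models]


def evaluated_lane(models, min_benchmarks=5):
    """Keep only models with enough public evidence to be scored.

    min_benchmarks is an assumed cutoff, not a value published by the page.
    """
    return [m.name for m in models if m.benchmark_count >= min_benchmarks]


# Placeholder entries, not real measurements.
models = [
    Model("model-a", 0),   # just launched, nothing public yet
    Model("model-b", 12),  # well covered
    Model("model-c", 3),   # partial coverage
]

print(frontier_lane(models))   # ['model-a', 'model-b', 'model-c']
print(evaluated_lane(models))  # ['model-b']
```

A new launch appears in the first list immediately and joins the second only after its evidence count crosses the cutoff, which is the behavior the two lanes are meant to keep apart.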
The evaluated composite now uses a weighted blend of normalized benchmark results, the existing quality layer, and a freshness signal. It also penalizes thin evidence, stale provider generations, and beta or compact variants, so that older models with saturated benchmark coverage do not dominate the ranking.
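One way to picture a blend of this kind is the sketch below. The page does not publish its weights or penalty factors, so every number, field name, and the multiplicative form of the penalties are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    benchmark: float   # normalized benchmark result, 0.0 to 1.0
    quality: float     # existing quality layer, 0.0 to 1.0
    freshness: float   # freshness signal, 0.0 to 1.0
    coverage: float    # share of benchmarks with public results, 0.0 to 1.0
    stale_generation: bool = False
    beta_or_compact: bool = False


# Assumed weights and penalty factors, for illustration only.
WEIGHTS = {"benchmark": 0.60, "quality": 0.25, "freshness": 0.15}
STALE_FACTOR = 0.8
VARIANT_FACTOR = 0.9


def composite(s: Signals) -> float:
    """Return a 0-100 score: weighted blend, then multiplicative penalties."""
    blend = (
        WEIGHTS["benchmark"] * s.benchmark
        + WEIGHTS["quality"] * s.quality
        + WEIGHTS["freshness"] * s.freshness
    )
    # Thin evidence: scale from 0.5 (no coverage) up to 1.0 (full coverage).
    score = blend * (0.5 + 0.5 * s.coverage)
    if s.stale_generation:
        score *= STALE_FACTOR
    if s.beta_or_compact:
        score *= VARIANT_FACTOR
    return 100 * score


current = Signals(benchmark=0.8, quality=0.7, freshness=0.9, coverage=0.5)
older = Signals(benchmark=0.8, quality=0.7, freshness=0.9, coverage=0.5,
                stale_generation=True)

print(composite(current) > composite(older))  # True
```

With identical raw signals, the model flagged as a stale generation scores lower, which is how a penalty of this shape would keep an older, heavily benchmarked model from outranking a current one on evidence volume alone.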
Explore the field
Search and filter the full ranked dataset
The top tables are quick reads. This table is for actual decision-making, when you need to narrow the field by lab, use case, release window, or pricing posture.
| Model | Composite | Coverage | Best for | Price |
|---|---|---|---|---|
| Claude Opus 4.6 Anthropic / 1.0M context | 65.6 | 49% | Chat API | $15.00 / $75.00 |
| GPT-5.2 OpenAI / 400K context | 57.6 | 67% | Multilingual API | $1.75 / $14.00 |
| Llama 4 Maverick Meta / 1.0M context | 57.6 | 28% | Coding Open API | $0.15 / $0.60 |
| Claude Sonnet 4.6 Anthropic / 1.0M context | 55.0 | 19% | Chat API | $3.00 / $15.00 |
| Llama 4 Scout Meta / 10.0M context | 50.1 | 13% | Coding Open API | $0.08 / $0.30 |
| GPT-5.2 Pro OpenAI / 400K context | 45.9 | 16% | Chat API | $21.00 / $168.00 |
| GPT-5 Pro OpenAI / 400K context | 43.0 | 16% | Chat API | $15.00 / $120.00 |
| Claude Opus 4.5 Anthropic / 200K context | 41.2 | 16% | Chat API | $5.00 / $25.00 |
| DeepSeek V3.2 DeepSeek / 164K context | 41.2 | 21% | General use Open API | $0.20 / $0.77 |
| o3 OpenAI / 200K context | 39.8 | 64% | Multilingual API | $2.00 / $8.00 |
| Gemini 3.1 Pro Google / 1.0M context | 38.2 | 9% | Chat API | $2.00 / $12.00 |
| Gemini 2.5 Pro Google / 1.0M context | 38.1 | 75% | Multilingual API | $1.25 / $10.00 |
| Grok 4 xAI / 256K context | 37.1 | 47% | Chat API | $3.00 / $15.00 |
| DeepSeek R1 DeepSeek / 164K context | 36.7 | 58% | General use Open API | $0.70 / $2.50 |
| Claude Opus 4 Anthropic / 200K context | 36.6 | 49% | Multilingual API | $15.00 / $75.00 |
| Gemini 3 Pro Google / 1.0M context | 36.2 | 13% | Chat API | $2.00 / $12.00 |
| Phi-4 Reasoning Microsoft / 33K context | 34.5 | 15% | Coding Open API | $0.07 / $0.14 |
| Grok 3 xAI / 131K context | 34.3 | 24% | General use API | $3.00 / $15.00 |
| Claude Sonnet 4 Anthropic / 1.0M context | 34.0 | 38% | General use API | $3.00 / $15.00 |
| Command A Cohere / 256K context | 32.9 | 9% | Coding Open API | $2.50 / $10.00 |
| R1 0528 DeepSeek / 164K context | 32.0 | 23% | Chat Open API | $0.50 / $2.15 |
| GPT-4o OpenAI / 128K context | 31.8 | 50% | General use API | $2.50 / $10.00 |
| o3 Pro OpenAI / 200K context | 31.5 | 20% | Reasoning API | $20.00 / $80.00 |
| o1 OpenAI / 200K context | 31.5 | 21% | General use API | $15.00 / $60.00 |
| Qwen3 Max Alibaba / 262K context | 31.2 | 15% | Coding Open API | $0.78 / $3.90 |
| GPT-4.5 OpenAI / 128K context | 30.3 | 22% | Coding API | $75.00 / $150.00 |
| Llama 3.3 70B Instruct Meta / 131K context | 30.1 | 14% | Coding Open API | $0.10 / $0.32 |
| GPT-5 OpenAI / 400K context | 30.0 | 16% | Chat API | $1.25 / $10.00 |
| GPT-4.1 OpenAI / 1.0M context | 29.5 | 21% | General use API | $2.00 / $8.00 |
| DeepSeek V3 DeepSeek / 131K context | 28.3 | 17% | Coding Open API | $0.20 / $0.80 |
| Claude Sonnet 4.5 Anthropic / 1.0M context | 28.0 | 13% | Chat API | $3.00 / $15.00 |
| Claude Haiku 4.5 Anthropic / 200K context | 27.9 | 11% | Chat API | $1.00 / $5.00 |
| Mistral Large Mistral / 128K context | 27.7 | 20% | General use Open API | $2.00 / $6.00 |
| Qwen3 235B A22B Alibaba / 131K context | 27.5 | 12% | Chat Open API | $0.46 / $1.82 |
| Claude 3.5 Sonnet Anthropic / 200K context | 27.5 | 24% | General use API | $3.00 / $15.00 |
| QwQ 32B Alibaba / 131K context | 26.7 | 15% | Coding Open API | $1.20 / $1.20 |
| Grok 4 Fast xAI / 2.1M context | 26.3 | 7% | Coding API | $0.20 / $0.50 |
| Claude 3.7 Sonnet Anthropic / 200K context | 26.0 | 16% | Chat API | $3.00 / $15.00 |
| o4 Mini OpenAI / 200K context | 25.5 | 15% | Coding API | $1.10 / $4.40 |
| Gemini 3 Flash Preview Google / 1.0M context | 25.4 | 11% | Chat API | $0.50 / $3.00 |
| Qwen2.5 72B Instruct Alibaba / 131K context | 25.2 | 17% | Coding Open API | $0.36 / $0.40 |
| o3 Mini OpenAI / 200K context | 24.3 | 15% | Coding API | $1.10 / $4.40 |
| Gemini 1.5 Pro Google / 2.1M context | 24.3 | 13% | General use API | $1.25 / $5.00 |
| Phi 4 Microsoft / 16K context | 24.2 | 13% | General use Open API | $0.07 / $0.14 |
| Mistral Small 3.1 24B Mistral / 128K context | 24.0 | 9% | Coding Open API | $0.35 / $0.56 |
| Gemini 2.5 Flash Google / 1.0M context | 23.6 | 17% | Coding API | $0.30 / $2.50 |
| Claude 3 Opus Anthropic / 200K context | 23.5 | 17% | General use API | $15.00 / $75.00 |
| Llama 3.1 405B Meta / 131K context | 23.1 | 14% | Coding Open API | $3.00 / $3.00 |
| Grok 3 Mini xAI / 131K context | 22.5 | 11% | Coding API | $0.30 / $0.50 |
| Qwen2.5 Coder 32B Instruct Alibaba / 128K context | 22.4 | 6% | Coding Open API | $0.66 / $1.00 |
| Grok 2 xAI / 131K context | 21.4 | 7% | General use API | $2.00 / $10.00 |
| Nemotron 70B NVIDIA / 131K context | 21.0 | 7% | General use Open API | $0.20 / $0.40 |
| GPT-4.1 Mini OpenAI / 1.0M context | 20.9 | 14% | General use API | $0.40 / $1.60 |
| Claude 3.5 Haiku Anthropic / 200K context | 20.7 | 13% | Coding API | $0.80 / $4.00 |
| GPT-4o-mini OpenAI / 128K context | 20.4 | 13% | Coding API | $0.15 / $0.60 |
| Gemini 2.5 Flash Lite Google / 1.0M context | 20.0 | 9% | Coding API | $0.10 / $0.40 |
| o1-mini OpenAI / 128K context | 18.9 | 11% | Coding API | $3.00 / $12.00 |
| Gemini 2.0 Flash Google / 1.0M context | 14.5 | 4% | Chat API | $0.10 / $0.40 |
| Command A Reasoning Cohere / 256K context | 11.8 | 0% | Frontier tracking Open API | $2.50 / $10.00 |
| Nova 2 Lite Amazon / 1.0M context | 10.4 | 0% | Frontier tracking API | $0.30 / $2.50 |
| Grok 4.1 Fast xAI / 2.1M context | 9.1 | 0% | Frontier tracking API | $0.20 / $0.50 |
| Yi Lightning 01.AI / 16K context | 8.0 | 0% | Frontier tracking Open API | $0.99 / $0.99 |
| Claude Haiku 4 Anthropic / 200K context | 7.2 | 0% | Frontier tracking API | $0.80 / $4.00 |
| Yi Vision 01.AI / 16K context | 7.2 | 0% | Frontier tracking Open API | $1.99 / $1.99 |
| Phi-4 Multimodal Microsoft / 128K context | 6.9 | 0% | Frontier tracking Open API | $0.10 / $0.20 |
| Qwen3 Coder 480B A35B Alibaba / 1.0M context | 6.7 | 0% | Frontier tracking Open API | $0.22 / $1.80 |
| Yi Large 01.AI / 33K context | 5.8 | 0% | Frontier tracking Open API | $3.00 / $3.00 |
| Qwen3 32B Alibaba / 131K context | 5.7 | 0% | Frontier tracking Open API | $0.08 / $0.28 |
| Mistral Medium 3 Mistral / 131K context | 5.7 | 0% | Frontier tracking Open API | $0.40 / $2.00 |
| Qwen3 30B A3B Alibaba / 131K context | 5.7 | 0% | Frontier tracking Open API | $0.09 / $0.45 |
| Codestral 2508 Mistral / 256K context | 5.6 | 0% | Frontier tracking API | $0.30 / $0.90 |
| MiniMax-01 MiniMax / 1.0M context | 5.6 | 0% | Frontier tracking API | $0.20 / $1.10 |
| Nova Premier 1.0 Amazon / 1.0M context | 5.6 | 0% | Frontier tracking API | $2.50 / $12.50 |
| Sonar Pro Perplexity / 200K context | 5.6 | 0% | Frontier tracking API | $3.00 / $15.00 |
| Codestral 25.01 Mistral / 256K context | 5.5 | 0% | Frontier tracking API | $0.30 / $0.90 |
| Sonar Perplexity / 127K context | 5.5 | 0% | Frontier tracking API | $1.00 / $1.00 |
| Nemotron Ultra NVIDIA / 131K context | 5.4 | 0% | Frontier tracking Open API | Open / free |
| GPT-5 Nano OpenAI / 400K context | 5.2 | 0% | Frontier tracking API | $0.05 / $0.40 |
| Command R+ (08-2024) Cohere / 128K context | 5.0 | 0% | Frontier tracking Open API | $2.50 / $10.00 |
| Mistral Large 2407 Mistral / 131K context | 4.9 | 0% | Frontier tracking Open API | $2.00 / $6.00 |
| Pixtral Large Mistral / 131K context | 4.9 | 0% | Frontier tracking API | $2.00 / $6.00 |
| DeepSeek V2.5 DeepSeek / 128K context | 4.9 | 0% | Frontier tracking Open API | $0.14 / $0.28 |
| Nova Pro 1.0 Amazon / 300K context | 4.9 | 0% | Frontier tracking API | $0.80 / $3.20 |
| Llama 3.1 70B Instruct Meta / 131K context | 4.9 | 0% | Frontier tracking Open API | $0.40 / $0.40 |
| Reka Core Reka / 128K context | 4.8 | 0% | Frontier tracking API | $3.00 / $15.00 |
| Jamba 1.5 Large AI21 Labs / 256K context | 4.7 | 0% | Frontier tracking API | $2.00 / $8.00 |
| Command R (08-2024) Cohere / 128K context | 4.7 | 0% | Frontier tracking Open API | $0.15 / $0.60 |
| Mistral Nemo Mistral / 131K context | 4.6 | 0% | Frontier tracking Open API | $0.02 / $0.03 |
| GPT-4.1 Nano OpenAI / 1.0M context | 4.5 | 0% | Frontier tracking API | $0.10 / $0.40 |
| Reka Flash 3 Reka / 66K context | 4.5 | 0% | Frontier tracking API | $0.10 / $0.20 |
| Gemini 2.0 Flash Lite Google / 1.0M context | 4.4 | 0% | Frontier tracking API | $0.07 / $0.30 |
| Llama 3.1 8B Instruct Meta / 131K context | 4.4 | 0% | Frontier tracking Open API | $0.02 / $0.03 |
| Nova Micro 1.0 Amazon / 128K context | 4.4 | 0% | Frontier tracking API | $0.04 / $0.14 |
| Command R7B (12-2024) Cohere / 128K context | 4.3 | 0% | Frontier tracking Open API | $0.04 / $0.15 |
| Gemini 1.5 Flash Google / 1.0M context | 3.9 | 0% | Frontier tracking API | $0.07 / $0.30 |
| Nova Lite 1.0 Amazon / 300K context | 3.8 | 0% | Frontier tracking API | $0.06 / $0.24 |
| Jamba 1.5 Mini AI21 Labs / 256K context | 3.6 | 0% | Frontier tracking API | $0.20 / $0.40 |
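Narrowing the ranked table by lab, use case, coverage, or price can be sketched as a small filter. The rows below are copied from the table above; the `Row` field names, the `filter_rows` helper, and the reading of the two Price figures as input and output prices are assumptions made for this example. Release window is left out because the table does not list release dates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    model: str
    lab: str
    composite: float
    coverage_pct: int
    best_for: str
    is_open: bool        # mirrors the "Open" tag in the table
    price_first: float   # first Price figure (assumed to be the input price)
    price_second: float  # second Price figure (assumed to be the output price)


ROWS = [
    Row("Claude Opus 4.6", "Anthropic", 65.6, 49, "Chat", False, 15.00, 75.00),
    Row("Llama 4 Maverick", "Meta", 57.6, 28, "Coding", True, 0.15, 0.60),
    Row("DeepSeek V3.2", "DeepSeek", 41.2, 21, "General use", True, 0.20, 0.77),
    Row("Phi-4 Reasoning", "Microsoft", 34.5, 15, "Coding", True, 0.07, 0.14),
    Row("Qwen3 32B", "Alibaba", 5.7, 0, "Frontier tracking", True, 0.08, 0.28),
]


def filter_rows(
    rows,
    *,
    lab: Optional[str] = None,
    best_for: Optional[str] = None,
    open_only: bool = False,
    min_coverage_pct: int = 0,
    max_price_second: Optional[float] = None,
):
    """Return matching rows, highest composite first."""
    kept = [
        r for r in rows
        if (lab is None or r.lab == lab)
        and (best_for is None or r.best_for == best_for)
        and (not open_only or r.is_open)
        and r.coverage_pct >= min_coverage_pct
        and (max_price_second is None or r.price_second <= max_price_second)
    ]
    return sorted(kept, key=lambda r: r.composite, reverse=True)


picks = filter_rows(ROWS, best_for="Coding", open_only=True, min_coverage_pct=10)
print([r.model for r in picks])  # ['Llama 4 Maverick', 'Phi-4 Reasoning']
```

Setting a minimum coverage is a simple way to exclude the 0% coverage rows, whose composite values rest on no public benchmark evidence.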