Ranking desk

AI Leaderboard Desk

This page separates three different questions. Frontier now answers what is current. Reliability Floor asks whether everyday basics are covered. Evaluated composite ranks models inside the benchmark-backed scored set.

Important

The old “composite leaderboard” confusion came from treating one scored table as the answer to everything. The frontier lane keeps newer launches visible immediately, while the evaluated composite only ranks models once there is enough public evidence to score them with some confidence.
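The split between the two lanes can be illustrated with a minimal sketch. It assumes a model's evidence state is derived from its benchmark coverage alone; the cut-off value and all function names are hypothetical, since the page does not publish the actual rule.

```python
from enum import Enum


class State(Enum):
    TRACKING = "tracking"   # visible in the frontier lane, no score yet
    PARTIAL = "partial"     # some public results, not enough to rank
    SCORED = "scored"       # enough evidence for the evaluated composite


# Hypothetical cut-off; the real threshold is not stated on the page.
MIN_COVERAGE_TO_SCORE = 0.10


def evidence_state(coverage: float) -> State:
    """Map benchmark coverage (a fraction from 0 to 1) to an evidence state."""
    if coverage <= 0.0:
        return State.TRACKING
    if coverage < MIN_COVERAGE_TO_SCORE:
        return State.PARTIAL
    return State.SCORED


def lanes(coverage: float, is_current_launch: bool) -> list[str]:
    """A model may appear in the frontier lane, the composite, or both."""
    result = []
    if is_current_launch:
        # Frontier visibility does not depend on benchmark coverage.
        result.append("frontier")
    if evidence_state(coverage) is State.SCORED:
        result.append("evaluated composite")
    return result
```

Under these assumptions a fresh launch with no public results still shows up in the frontier lane, while the composite only picks it up once its coverage passes the cut-off.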

Model | State | Price
GPT-5.4, OpenAI / 1.1M context | tracking tracking | $2.50 / $15.00
Claude Opus 4.6, Anthropic / 1.0M context | scored active | $15.00 / $75.00
Gemini 3.1 Pro, Google / 1.0M context | partial tracking | $2.00 / $12.00
Claude Sonnet 4.6, Anthropic / 1.0M context | scored active | $3.00 / $15.00
Grok 4.20, xAI / 2.0M context | tracking tracking | $1.25 / $2.50
Qwen 3.6 Plus, Alibaba / 1.0M context | tracking tracking | $0.33 / $1.95
MiniMax M2.7, MiniMax / 205K context | tracking tracking | $0.28 / $1.20
GLM-5, Zhipu AI / 203K context | tracking tracking | $0.60 / $1.92
Kimi K2.5, Moonshot AI / 131K context | tracking tracking | $0.57 / $2.30
Gemma 4, Google / 262K context | tracking tracking Open | $0.12 / $0.36
DeepSeek R1, DeepSeek / 164K context | scored active Open | $0.70 / $2.50
DeepSeek V3.2, DeepSeek / 164K context | scored active Open | $0.20 / $0.77
Nemotron 3 Ultra, NVIDIA / 1.0M context | tracking tracking | $0.50 / $2.50
Qwen3.7 Plus, Alibaba / 1.0M context | tracking tracking | $0.40 / $1.60
MiniMax M3, MiniMax / 1.0M context | tracking tracking | $0.30 / $1.20
Claude Opus 4.8, Anthropic / 1.0M context | tracking tracking | $5.00 / $25.00
Claude Opus 4.8 (Fast), Anthropic / 1.0M context | tracking tracking | $10.00 / $50.00
Qwen3.7 Max, Alibaba / 1.0M context | tracking tracking | $1.25 / $3.75

The frontier lane is intentionally not a synthetic score. It keeps the current flagship and launch watchlist visible even when benchmark coverage is still catching up.

Evaluated composite now uses a weighted blend of normalized benchmark results, the existing quality layer, and a freshness signal. It also penalizes thin evidence, stale provider generations, and beta or compact variants so older benchmark saturation does not dominate the story.
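As a rough illustration of how a blend like this could be computed, the sketch below combines three inputs on a 0 to 100 scale and then subtracts flat penalties. Every weight, penalty size, threshold, and field name here is a placeholder assumption chosen to show the shape of the calculation; none of them are the leaderboard's actual values.

```python
from dataclasses import dataclass

# Illustrative assumptions only, not the leaderboard's real parameters.
WEIGHTS = {"benchmarks": 0.6, "quality": 0.3, "freshness": 0.1}
THIN_EVIDENCE_COVERAGE = 0.25  # coverage below this counts as thin evidence
PENALTIES = {
    "thin_evidence": 10.0,
    "stale_generation": 8.0,
    "beta_or_compact": 5.0,
}


@dataclass
class ModelEvidence:
    benchmarks: float   # normalized benchmark results, 0-100
    quality: float      # existing quality layer, 0-100
    freshness: float    # freshness signal, 0-100
    coverage: float     # share of benchmarks with public results, 0-1
    stale_generation: bool = False
    beta_or_compact: bool = False


def evaluated_composite(m: ModelEvidence) -> float:
    """Weighted blend minus penalties, clamped to the 0-100 range."""
    blend = (
        WEIGHTS["benchmarks"] * m.benchmarks
        + WEIGHTS["quality"] * m.quality
        + WEIGHTS["freshness"] * m.freshness
    )
    penalty = 0.0
    if m.coverage < THIN_EVIDENCE_COVERAGE:
        penalty += PENALTIES["thin_evidence"]
    if m.stale_generation:
        penalty += PENALTIES["stale_generation"]
    if m.beta_or_compact:
        penalty += PENALTIES["beta_or_compact"]
    return round(max(0.0, min(100.0, blend - penalty)), 1)
```

With these placeholder numbers, an older model with strong but sparse benchmark results loses ground to a well-covered current model, which is the effect the penalties are described as aiming for.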

Explore the field

Search and filter the full ranked dataset

The top tables are quick reads. This table is for actual decision-making, when you need to narrow the field by lab, use case, release window, or pricing posture.

Model | Composite | Coverage | Best for | Price
Claude Opus 4.6, Anthropic / 1.0M context | 65.6 | 49% | Chat (API) | $15.00 / $75.00
GPT-5.2, OpenAI / 400K context | 57.6 | 67% | multilingual (API) | $1.75 / $14.00
Llama 4 Maverick, Meta / 1.0M context | 57.6 | 28% | Coding (Open API) | $0.15 / $0.60
Claude Sonnet 4.6, Anthropic / 1.0M context | 55.0 | 19% | Chat (API) | $3.00 / $15.00
Llama 4 Scout, Meta / 10.0M context | 50.1 | 13% | Coding (Open API) | $0.08 / $0.30
GPT-5.2 Pro, OpenAI / 400K context | 45.9 | 16% | Chat (API) | $21.00 / $168.00
GPT-5 Pro, OpenAI / 400K context | 43.0 | 16% | Chat (API) | $15.00 / $120.00
Claude Opus 4.5, Anthropic / 200K context | 41.2 | 16% | Chat (API) | $5.00 / $25.00
DeepSeek V3.2, DeepSeek / 164K context | 41.2 | 21% | General use (Open API) | $0.20 / $0.77
O3, OpenAI / 200K context | 39.8 | 64% | multilingual (API) | $2.00 / $8.00
Gemini 3.1 Pro, Google / 1.0M context | 38.2 | 9% | Chat (API) | $2.00 / $12.00
Gemini 2.5 Pro, Google / 1.0M context | 38.1 | 75% | multilingual (API) | $1.25 / $10.00
Grok 4, xAI / 256K context | 37.1 | 47% | Chat (API) | $3.00 / $15.00
DeepSeek R1, DeepSeek / 164K context | 36.7 | 58% | General use (Open API) | $0.70 / $2.50
Claude Opus 4, Anthropic / 200K context | 36.6 | 49% | multilingual (API) | $15.00 / $75.00
Gemini 3 Pro, Google / 1.0M context | 36.2 | 13% | Chat (API) | $2.00 / $12.00
Phi-4 Reasoning, Microsoft / 33K context | 34.5 | 15% | Coding (Open API) | $0.07 / $0.14
Grok 3, xAI / 131K context | 34.3 | 24% | General use (API) | $3.00 / $15.00
Claude Sonnet 4, Anthropic / 1.0M context | 34.0 | 38% | General use (API) | $3.00 / $15.00
Command A, Cohere / 256K context | 32.9 | 9% | Coding (Open API) | $2.50 / $10.00
R1 0528, DeepSeek / 164K context | 32.0 | 23% | Chat (Open API) | $0.50 / $2.15
GPT-4o, OpenAI / 128K context | 31.8 | 50% | General use (API) | $2.50 / $10.00
O3 Pro, OpenAI / 200K context | 31.5 | 20% | Reasoning (API) | $20.00 / $80.00
o1, OpenAI / 200K context | 31.5 | 21% | General use (API) | $15.00 / $60.00
Qwen3 Max, Alibaba / 262K context | 31.2 | 15% | Coding (Open API) | $0.78 / $3.90
GPT-4.5, OpenAI / 128K context | 30.3 | 22% | Coding (API) | $75.00 / $150.00
Llama 3.3 70B Instruct, Meta / 131K context | 30.1 | 14% | Coding (Open API) | $0.10 / $0.32
GPT-5, OpenAI / 400K context | 30.0 | 16% | Chat (API) | $1.25 / $10.00
GPT-4.1, OpenAI / 1.0M context | 29.5 | 21% | General use (API) | $2.00 / $8.00
DeepSeek V3, DeepSeek / 131K context | 28.3 | 17% | Coding (Open API) | $0.20 / $0.80
Claude Sonnet 4.5, Anthropic / 1.0M context | 28.0 | 13% | Chat (API) | $3.00 / $15.00
Claude Haiku 4.5, Anthropic / 200K context | 27.9 | 11% | Chat (API) | $1.00 / $5.00
Mistral Large, Mistral / 128K context | 27.7 | 20% | General use (Open API) | $2.00 / $6.00
Qwen3 235B A22B, Alibaba / 131K context | 27.5 | 12% | Chat (Open API) | $0.46 / $1.82
Claude 3.5 Sonnet, Anthropic / 200K context | 27.5 | 24% | General use (API) | $3.00 / $15.00
QwQ 32B, Alibaba / 131K context | 26.7 | 15% | Coding (Open API) | $1.20 / $1.20
Grok 4 Fast, xAI / 2.1M context | 26.3 | 7% | Coding (API) | $0.20 / $0.50
Claude 3.7 Sonnet, Anthropic / 200K context | 26.0 | 16% | Chat (API) | $3.00 / $15.00
O4 Mini, OpenAI / 200K context | 25.5 | 15% | Coding (API) | $1.10 / $4.40
Gemini 3 Flash Preview, Google / 1.0M context | 25.4 | 11% | Chat (API) | $0.50 / $3.00
Qwen2.5 72B Instruct, Alibaba / 131K context | 25.2 | 17% | Coding (Open API) | $0.36 / $0.40
o3 Mini, OpenAI / 200K context | 24.3 | 15% | Coding (API) | $1.10 / $4.40
Gemini 1.5 Pro, Google / 2.1M context | 24.3 | 13% | General use (API) | $1.25 / $5.00
Phi 4, Microsoft / 16K context | 24.2 | 13% | General use (Open API) | $0.07 / $0.14
Mistral Small 3.1 24B, Mistral / 128K context | 24.0 | 9% | Coding (Open API) | $0.35 / $0.56
Gemini 2.5 Flash, Google / 1.0M context | 23.6 | 17% | Coding (API) | $0.30 / $2.50
Claude 3 Opus, Anthropic / 200K context | 23.5 | 17% | General use (API) | $15.00 / $75.00
Llama 3.1 405B, Meta / 131K context | 23.1 | 14% | Coding (Open API) | $3.00 / $3.00
Grok 3 Mini, xAI / 131K context | 22.5 | 11% | Coding (API) | $0.30 / $0.50
Qwen2.5 Coder 32B Instruct, Alibaba / 128K context | 22.4 | 6% | Coding (Open API) | $0.66 / $1.00
Grok 2, xAI / 131K context | 21.4 | 7% | General use (API) | $2.00 / $10.00
Nemotron 70B, NVIDIA / 131K context | 21.0 | 7% | General use (Open API) | $0.20 / $0.40
GPT-4.1 Mini, OpenAI / 1.0M context | 20.9 | 14% | General use (API) | $0.40 / $1.60
Claude 3.5 Haiku, Anthropic / 200K context | 20.7 | 13% | Coding (API) | $0.80 / $4.00
GPT-4o-mini, OpenAI / 128K context | 20.4 | 13% | Coding (API) | $0.15 / $0.60
Gemini 2.5 Flash Lite, Google / 1.0M context | 20.0 | 9% | Coding (API) | $0.10 / $0.40
o1-mini, OpenAI / 128K context | 18.9 | 11% | Coding (API) | $3.00 / $12.00
Gemini 2.0 Flash, Google / 1.0M context | 14.5 | 4% | Chat (API) | $0.10 / $0.40
Command A Reasoning, Cohere / 256K context | 11.8 | 0% | Frontier tracking (Open API) | $2.50 / $10.00
Nova 2 Lite, Amazon / 1.0M context | 10.4 | 0% | Frontier tracking (API) | $0.30 / $2.50
Grok 4.1 Fast, xAI / 2.1M context | 9.1 | 0% | Frontier tracking (API) | $0.20 / $0.50
Yi Lightning, 01.AI / 16K context | 8.0 | 0% | Frontier tracking (Open API) | $0.99 / $0.99
Claude Haiku 4, Anthropic / 200K context | 7.2 | 0% | Frontier tracking (API) | $0.80 / $4.00
Yi Vision, 01.AI / 16K context | 7.2 | 0% | Frontier tracking (Open API) | $1.99 / $1.99
Phi-4 Multimodal, Microsoft / 128K context | 6.9 | 0% | Frontier tracking (Open API) | $0.10 / $0.20
Qwen3 Coder 480B A35B, Alibaba / 1.0M context | 6.7 | 0% | Frontier tracking (Open API) | $0.22 / $1.80
Yi Large, 01.AI / 33K context | 5.8 | 0% | Frontier tracking (Open API) | $3.00 / $3.00
Qwen3 32B, Alibaba / 131K context | 5.7 | 0% | Frontier tracking (Open API) | $0.08 / $0.28
Mistral Medium 3, Mistral / 131K context | 5.7 | 0% | Frontier tracking (Open API) | $0.40 / $2.00
Qwen3 30B A3B, Alibaba / 131K context | 5.7 | 0% | Frontier tracking (Open API) | $0.09 / $0.45
Codestral 2508, Mistral / 256K context | 5.6 | 0% | Frontier tracking (API) | $0.30 / $0.90
MiniMax-01, MiniMax / 1.0M context | 5.6 | 0% | Frontier tracking (API) | $0.20 / $1.10
Nova Premier 1.0, Amazon / 1.0M context | 5.6 | 0% | Frontier tracking (API) | $2.50 / $12.50
Sonar Pro, Perplexity / 200K context | 5.6 | 0% | Frontier tracking (API) | $3.00 / $15.00
Codestral 25.01, Mistral / 256K context | 5.5 | 0% | Frontier tracking (API) | $0.30 / $0.90
Sonar, Perplexity / 127K context | 5.5 | 0% | Frontier tracking (API) | $1.00 / $1.00
Nemotron Ultra, NVIDIA / 131K context | 5.4 | 0% | Frontier tracking (Open API) | Open / free
GPT-5 Nano, OpenAI / 400K context | 5.2 | 0% | Frontier tracking (API) | $0.05 / $0.40
Command R+ (08-2024), Cohere / 128K context | 5.0 | 0% | Frontier tracking (Open API) | $2.50 / $10.00
Mistral Large 2407, Mistral / 131K context | 4.9 | 0% | Frontier tracking (Open API) | $2.00 / $6.00
Pixtral Large, Mistral / 131K context | 4.9 | 0% | Frontier tracking (API) | $2.00 / $6.00
DeepSeek V2.5, DeepSeek / 128K context | 4.9 | 0% | Frontier tracking (Open API) | $0.14 / $0.28
Nova Pro 1.0, Amazon / 300K context | 4.9 | 0% | Frontier tracking (API) | $0.80 / $3.20
Llama 3.1 70B Instruct, Meta / 131K context | 4.9 | 0% | Frontier tracking (Open API) | $0.40 / $0.40
Reka Core, Reka / 128K context | 4.8 | 0% | Frontier tracking (API) | $3.00 / $15.00
Jamba 1.5 Large, AI21 Labs / 256K context | 4.7 | 0% | Frontier tracking (API) | $2.00 / $8.00
Command R (08-2024), Cohere / 128K context | 4.7 | 0% | Frontier tracking (Open API) | $0.15 / $0.60
Mistral Nemo, Mistral / 131K context | 4.6 | 0% | Frontier tracking (Open API) | $0.02 / $0.03
GPT-4.1 Nano, OpenAI / 1.0M context | 4.5 | 0% | Frontier tracking (API) | $0.10 / $0.40
Reka Flash 3, Reka / 66K context | 4.5 | 0% | Frontier tracking (API) | $0.10 / $0.20
Gemini 2.0 Flash Lite, Google / 1.0M context | 4.4 | 0% | Frontier tracking (API) | $0.07 / $0.30
Llama 3.1 8B Instruct, Meta / 131K context | 4.4 | 0% | Frontier tracking (Open API) | $0.02 / $0.03
Nova Micro 1.0, Amazon / 128K context | 4.4 | 0% | Frontier tracking (API) | $0.04 / $0.14
Command R7B (12-2024), Cohere / 128K context | 4.3 | 0% | Frontier tracking (Open API) | $0.04 / $0.15
Gemini 1.5 Flash, Google / 1.0M context | 3.9 | 0% | Frontier tracking (API) | $0.07 / $0.30
Nova Lite 1.0, Amazon / 300K context | 3.8 | 0% | Frontier tracking (API) | $0.06 / $0.24
Jamba 1.5 Mini, AI21 Labs / 256K context | 3.6 | 0% | Frontier tracking (API) | $0.20 / $0.40
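
For readers who want to reproduce this kind of search offline, here is a minimal sketch of filtering rows by use case, the Open tag, coverage, and price. The `Row` type, the `search` function, and its parameters are hypothetical names, not part of the site. The sample rows are copied from the table above. The page does not label the two price figures, so the code simply calls them first and second.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches prices as displayed in the table, e.g. "$0.20 / $0.77".
_PRICE = re.compile(r"\$(\d+(?:\.\d+)?)\s*/\s*\$(\d+(?:\.\d+)?)")


@dataclass
class Row:
    model: str
    provider: str
    composite: float
    coverage_pct: int
    best_for: str
    open_tag: bool   # True when the row carries the "Open" tag
    price: str       # kept as displayed


ROWS = [
    Row("Claude Opus 4.6", "Anthropic", 65.6, 49, "Chat", False, "$15.00 / $75.00"),
    Row("Llama 4 Maverick", "Meta", 57.6, 28, "Coding", True, "$0.15 / $0.60"),
    Row("DeepSeek V3.2", "DeepSeek", 41.2, 21, "General use", True, "$0.20 / $0.77"),
    Row("Phi-4 Reasoning", "Microsoft", 34.5, 15, "Coding", True, "$0.07 / $0.14"),
    Row("Nemotron Ultra", "NVIDIA", 5.4, 0, "Frontier tracking", True, "Open / free"),
]


def parse_price(price: str) -> Optional[tuple]:
    """Return (first, second) as floats, or None for entries like 'Open / free'."""
    match = _PRICE.fullmatch(price.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def search(rows, *, best_for=None, open_only=False,
           max_first_price=None, min_coverage_pct=0):
    """Filter rows, then sort by composite score, highest first."""
    hits = []
    for row in rows:
        if best_for is not None and row.best_for.lower() != best_for.lower():
            continue
        if open_only and not row.open_tag:
            continue
        if row.coverage_pct < min_coverage_pct:
            continue
        if max_first_price is not None:
            parsed = parse_price(row.price)
            # Rows without a numeric price are treated as passing the cap.
            if parsed is not None and parsed[0] > max_first_price:
                continue
        hits.append(row)
    return sorted(hits, key=lambda r: r.composite, reverse=True)
```

For example, `search(ROWS, open_only=True, min_coverage_pct=20)` keeps only Open-tagged rows with at least 20% coverage. Setting a coverage floor is one way to avoid leaning on composite values that rest on little or no benchmark evidence.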