What's new
2h ago: Recomputed benchmark-weighted quality scores. Refreshed the model quality layer that feeds ranking and comparison pages.
2h ago: Synced Chatbot Arena benchmark track. Updated the frontier conversation signal used in leaderboard weighting.
2h ago: Updated speed measurements. Refreshed output speed and latency references for tracked models.
2h ago: Validated official pricing snapshots. Rechecked provider pricing pages against the comparison database.
2h ago: Pulled latest OpenRouter price index. Updated comparison data for providers and routed model endpoints.
6h ago: Jobs market snapshot refreshed. 1,058 open roles across 10 tracked companies.
5d ago: OpenGuardrails: An Open-Source Context-Aware AI Guardrails Platform (Papers With Code). Featured in the latest daily brief.
5d ago: Published the 2026-05-25 daily digest. 7 stories captured from tracked sources.
2h ago: Meta is reportedly developing an AI pendant (TechCrunch)
2h ago: I put Google’s 24/7 AI assistant Gemini Spark to work, and it’s actually pretty useful (TechCrunch)
Refreshed 2h ago

Frontier - composite leaderboard

12 of 97
| # | Model | Provider | AIRH Composite | AIRH Real-World | AIRH Value | Delta 7d / Trend | $ in / out | Speed | Ctx | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 (claude-opus-4.6) | Anthropic | 67.2 | 89.0 | 15 | flat | $15.00 / $75.00 | 50 | 1M | multimodal, api |
| 2 | GPT-5.2 (gpt-5.2) | OpenAI | 57.6 | 90.0 | 82 | flat | $1.75 / $14.00 | 85 | 400K | multimodal, api |
| 3 | Llama 4 Maverick (llama-4-maverick) | Meta | 57.6 | 76.0 | 1,559 | flat | $0.15 / $0.60 | 95 | 1.0M | open, multimodal |
| 4 | Claude Sonnet 4.6 (claude-sonnet-4.6) | Anthropic | 55.0 | 86.0 | 72 | flat | $3.00 / $15.00 | 90 | 1M | multimodal, api |
| 5 | Llama 4 Scout (llama-4-scout) | Meta | 50.1 | 79.0 | 3,224 | flat | $0.08 / $0.30 | 120 | 10M | open, multimodal, fast |
| 6 | GPT-5.2 Pro (gpt-5.2-pro) | OpenAI | 45.9 | 93.0 | 7 | flat | $21.00 / $168.00 | pending | 400K | multimodal, api |
| 7 | GPT-5 Pro (gpt-5-pro) | OpenAI | 43.0 | 90.0 | 10 | flat | $15.00 / $120.00 | pending | 400K | multimodal, api |
| 8 | Claude Opus 4.5 (claude-opus-4.5) | Anthropic | 41.2 | 86.0 | 43 | flat | $5.00 / $25.00 | pending | 200K | multimodal, api |
| 9 | DeepSeek V3.2 (deepseek-v3.2) | DeepSeek | 41.2 | 77.0 | 1,227 | flat | $0.20 / $0.77 | 49 | 164K | open |
| 10 | O3 (o3) | OpenAI | 39.8 | 88.0 | 135 | flat | $2.00 / $8.00 | 15 | 200K | reasoning, multimodal, api |
| 11 | Gemini 3.1 Pro (gemini-3.1-pro) | Google | 38.2 | 96.0 | 101 | flat | $2.00 / $12.00 | 80 | 1.0M | multimodal, api |
| 12 | Gemini 2.5 Pro (gemini-2.5-pro) | Google | 38.1 | 83.0 | 106 | flat | $1.25 / $10.00 | 90 | 1.0M | multimodal, api |
Showing 12 of 97 - sorted by AIRH Composite

Pulse - current table

97 ranked public LLM rows

37 open-weight rows

32 rows with speed data

Value leader: Mistral Nemo