AIME 2025

reasoning

AIME 2025 is the 2025 edition of the American Invitational Mathematics Examination and consists of 15 extremely challenging mathematics problems. The AIME is a prestigious competition that serves as a qualifier for the USA Mathematical Olympiad.

Models Tested: 7
Best Score: 96.7
Average Score: 87.3
Scale Range: 0–100
Weight: 1.4x

How It Works

Models solve 15 problems, and the answer to each is an integer from 0 to 999. The problems require sophisticated mathematical reasoning across algebra, geometry, number theory, and combinatorics. Because they were written for the 2025 contest, these problems are unlikely to have appeared in training data.
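The scoring described above can be sketched as exact-match grading on the final integer. This is a minimal illustration, not the leaderboard's actual grader: the function names `normalize_answer` and `score_percent` are hypothetical, and it assumes each model response has already been reduced to a single answer string.

```python
from typing import Optional, Sequence


def normalize_answer(raw: str) -> Optional[int]:
    """Return the answer as an int in [0, 999], or None if it is not valid.

    Hypothetical helper: assumes the final answer was already extracted
    from the model's response as a plain string such as "042".
    """
    text = raw.strip()
    # Accept only plain ASCII digits; signs, decimals, and words are rejected.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    return value if 0 <= value <= 999 else None


def score_percent(predictions: Sequence[str], answer_key: Sequence[int]) -> float:
    """Percentage of problems answered exactly right, on the 0-100 scale."""
    if len(predictions) != len(answer_key):
        raise ValueError("need exactly one prediction per problem")
    correct = sum(
        normalize_answer(pred) == key
        for pred, key in zip(predictions, answer_key)
    )
    return 100.0 * correct / len(answer_key)
```

Under this scheme an invalid or missing answer simply counts as wrong, and there is no partial credit for a nearly correct derivation.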

Why It Matters

AIME 2025 is particularly valuable because the problems are recent enough to avoid data contamination. Its difficulty (only about the top 5% of US high school competitors qualify to sit the exam) makes it an excellent discriminator of frontier model reasoning.

Limitations

With only 15 problems, each problem is worth about 6.7 points, so scores have high variance. Grading only the final integer answer ignores the reasoning process. The problems are written in a mathematical competition style and do not reflect real-world applications of mathematics.
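To give a rough sense of how large that variance is, the sketch below estimates the standard error of a score from a single pass over the problem set. It assumes, as a simplification, that problems are independent and equally likely to be solved (a binomial model); the function name `score_standard_error` is hypothetical.

```python
import math


def score_standard_error(accuracy_percent: float, num_problems: int = 15) -> float:
    """Binomial standard error of a score, in points on the 0-100 scale.

    Assumption: every problem is an independent trial with the same
    success probability, which real problem sets only approximate.
    """
    p = accuracy_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / num_problems)


# One problem out of 15 moves the score by 100 / 15 points.
points_per_problem = 100.0 / 15

# Standard error at the leaderboard's average score of 87.3.
se_at_average = score_standard_error(87.3)
print(f"{points_per_problem:.1f} points per problem, "
      f"standard error {se_at_average:.1f} points")
```

At the average score of 87.3, this model gives a standard error of roughly 8.6 points, which is larger than several of the gaps between adjacent leaderboard entries.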

Leaderboard — AIME 2025

# | Model | Provider | Score
🥇 | o3 Pro | OpenAI | 96.7
🥈 | o4 Mini | OpenAI | 92.7
🥉 | o3 | OpenAI | 91.6
4 | Gemini 2.5 Pro Preview 06-05 | Google | 86.7
5 | Grok 3 Beta | xAI | 83.9
6 | R1 | DeepSeek | 79.8
7 | QwQ 32B | Alibaba | 79.5
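The Best Score and Average Score figures can be checked against the leaderboard with a few lines of arithmetic. The snippet below is only a sanity check, assuming the average is a plain unweighted mean of the seven listed scores; the variable names are illustrative.

```python
from statistics import mean

# Scores copied from the leaderboard above.
leaderboard_scores = {
    "o3 Pro": 96.7,
    "o4 Mini": 92.7,
    "o3": 91.6,
    "Gemini 2.5 Pro Preview 06-05": 86.7,
    "Grok 3 Beta": 83.9,
    "R1": 79.8,
    "QwQ 32B": 79.5,
}

scores = list(leaderboard_scores.values())
best_score = max(scores)
average_score = round(mean(scores), 1)  # unweighted mean, one decimal place

print(f"models={len(scores)} best={best_score} average={average_score}")
```

The result matches the summary figures: 7 models, a best score of 96.7, and an average of 87.3.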