MMLU-Pro

knowledge

MMLU-Pro is a harder version of MMLU that offers 10 answer choices per question instead of 4, which lowers the odds of a correct random guess. Its questions are more challenging and require deeper reasoning.

Models Tested: 0

Average Score: 0.0

Scale Range: 0–100

Weight: 1.2x

How It Works

MMLU-Pro follows the same multiple-choice format as MMLU, but each question has 10 answer choices and the questions are harder, often requiring multi-step reasoning. With more options, random guessing is far less effective: the chance baseline drops from 25% to 10%.
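The effect of the larger option set on the guessing baseline can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the chance-corrected score is a common way to compare accuracies across different option counts, not something the benchmark or this leaderboard is stated to apply.

```python
from typing import Sequence


def random_guess_baseline(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing over num_choices options."""
    if num_choices < 2:
        raise ValueError("a multiple-choice question needs at least 2 options")
    return 1.0 / num_choices


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predicted option letters that match the gold letters."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def chance_corrected(acc: float, num_choices: int) -> float:
    """Rescale accuracy so random guessing maps to 0.0 and a perfect score to 1.0.

    Assumption: this rescaling is for illustration, not the benchmark's own metric.
    """
    baseline = random_guess_baseline(num_choices)
    return (acc - baseline) / (1.0 - baseline)


if __name__ == "__main__":
    print(f"MMLU baseline (4 options):      {random_guess_baseline(4):.0%}")
    print(f"MMLU-Pro baseline (10 options): {random_guess_baseline(10):.0%}")

    # Toy answers using option letters A-J; the data is made up for the example.
    preds = ["A", "C", "J", "B"]
    gold = ["A", "C", "J", "D"]
    acc = accuracy(preds, gold)
    print(f"raw accuracy: {acc:.2f}, chance-corrected: {chance_corrected(acc, 10):.2f}")
```

The same raw accuracy means more on a 10-option test than on a 4-option one, because less of it can be attributed to luck; the rescaling makes that difference explicit.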

Why It Matters

As models began saturating the original MMLU benchmark, MMLU-Pro was created to better differentiate between top-performing models. With harder questions and a lower guessing baseline, it gives a more reliable signal of genuine understanding than the original.

Limitations

MMLU-Pro still relies on a multiple-choice format. Because it is newer, there is less historical data for trend analysis. Some questions may still be solvable through pattern matching rather than true understanding.

Leaderboard — MMLU-Pro

No model scores recorded yet for this benchmark.