MMMU-Pro
MMMU-Pro is a harder version of MMMU that eliminates shortcut strategies. It tests multimodal understanding across 30 subjects in 6 disciplines, with 30 different image types including charts, diagrams, and chemical structures.
- **Models Tested:** 4
- **Best Score:** 72.0
- **Average Score:** 63.8
- **Scale Range:** 0–100
- **Weight:** 1.2x
How It Works
Models answer questions about images that require genuine understanding, not just pattern matching. The "Pro" version removes answer choices that could be eliminated without looking at the image, making it a purer test of visual comprehension.
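Under the hood, scoring for a multiple-choice benchmark like this typically reduces to exact-match accuracy over the model's chosen options. A minimal sketch (the function and data here are hypothetical, not the official MMMU-Pro harness):

```python
def score(predictions: list[str], answers: list[str]) -> float:
    """Exact-match accuracy over multiple-choice items, on a 0-100 scale."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Toy example: 3 of 4 letter choices match the answer key.
print(score(["B", "D", "A", "C"], ["B", "D", "C", "C"]))  # 75.0
```

In practice the harness must also extract the chosen letter from free-form model output before this comparison, which is where much of the real evaluation complexity lives.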
Why It Matters
As models improve on MMMU partly through shortcuts, MMMU-Pro ensures scores reflect genuine multimodal understanding. It is the gold standard for testing whether AI truly comprehends visual academic content.
Limitations
Requires multimodal input, so text-only models cannot be tested. It is a newer benchmark with less historical data, and some image types (e.g. music sheets) may be underrepresented in training data.
Leaderboard — MMMU-Pro
| # | Model | Provider | Score |
|---|---|---|---|
| 🥇 | GPT-5.2 | OpenAI | 72.0 |
| 🥈 | Gemini 2.5 Pro Preview 06-05 | Google | 68.0 |
| 🥉 | Claude Opus 4 | Anthropic | 64.0 |
| 4 | GPT-4o | OpenAI | 51.0 |
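The summary statistics above follow directly from the four leaderboard scores; a quick check (model names and scores taken from the table, the rounding convention is an assumption):

```python
# Leaderboard scores as listed above.
scores = {
    "GPT-5.2": 72.0,
    "Gemini 2.5 Pro Preview 06-05": 68.0,
    "Claude Opus 4": 64.0,
    "GPT-4o": 51.0,
}

best = max(scores.values())
average = round(sum(scores.values()) / len(scores), 1)  # 255 / 4 = 63.75
print(best, average)  # 72.0 63.8
```

This reproduces the "Best Score" of 72.0 and the "Average Score" of 63.8 (63.75 rounded to one decimal place).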