MMMU-Pro

multimodal

MMMU-Pro is a harder version of MMMU designed to eliminate shortcut strategies. It tests multimodal understanding across 30 subjects and 6 disciplines, with 30 different image types including charts, diagrams, and chemical structures.

Models Tested: 4
Best Score: 72.0
Average Score: 63.8
Scale Range: 0–100
Weight: 1.2x
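
The Weight stat suggests MMMU-Pro counts 1.2x in the site's overall ranking. Below is a minimal sketch of one way such a weighted aggregate could be computed, assuming a weighted mean over 0–100 scores; the second benchmark, its weight, and the scores are invented for illustration, not taken from the site.

```python
# Sketch: combining per-benchmark scores into one weighted aggregate.
# The 1.2x weight for MMMU-Pro comes from the stats above; everything
# else (the weighted-mean formula, the second benchmark) is assumed.

def weighted_aggregate(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of benchmark scores, each on a 0-100 scale."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

weights = {"MMMU-Pro": 1.2, "OtherBenchmark": 1.0}  # hypothetical second benchmark
scores = {"MMMU-Pro": 72.0, "OtherBenchmark": 80.0}
print(f"{weighted_aggregate(scores, weights):.1f}")  # -> 75.6
```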

How It Works

Models answer questions about images that require genuine understanding, not just pattern matching. The "Pro" version filters out questions that can be answered without seeing the image and expands the answer choices from four to as many as ten, so correct answers can no longer be reached by eliminating options without understanding the image, making it a purer test.
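
A minimal sketch of the scoring loop this implies, assuming the standard exact-match multiple-choice protocol; the sample item, file name, and `model_answer` stub are illustrative stand-ins, not the benchmark's actual harness.

```python
# Sketch of MMMU-Pro-style multiple-choice scoring: the model sees an
# image plus a question and options, and must return one option letter.
import string

def model_answer(image_path: str, question: str, options: list[str]) -> str:
    """Placeholder: always guesses 'A'. Swap in a real multimodal model call."""
    return "A"

def score(items: list[dict]) -> float:
    """Exact-match accuracy over multiple-choice items, on a 0-100 scale."""
    correct = 0
    for item in items:
        valid = string.ascii_uppercase[: len(item["options"])]  # "A".."J" for 10 options
        pred = model_answer(item["image"], item["question"], item["options"])
        correct += pred in valid and pred == item["answer"]
    return 100.0 * correct / len(items)

# Hypothetical item, invented for illustration only.
items = [{
    "image": "chem_structure_042.png",
    "question": "Which functional group is highlighted in the structure?",
    "options": ["ketone", "ester", "amide", "ether"],
    "answer": "C",
}]
print(score(items))  # 0.0 with the placeholder model
```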

Why It Matters

As models get better at MMMU through shortcuts, MMMU-Pro ensures scores reflect genuine multimodal understanding. It's the gold standard for testing whether AI truly comprehends visual academic content.

Limitations

Requires multimodal input, so text-only models cannot be tested. Newer benchmark with less historical data for comparison. Some image types (e.g. music sheets) may be underrepresented in training data.

Leaderboard — MMMU-Pro

#    Model                          Provider    Score
🥇   GPT-5.2                        OpenAI      72.0
🥈   Gemini 2.5 Pro Preview 06-05   Google      68.0
🥉   Claude Opus 4                  Anthropic   64.0
4    GPT-4o                         OpenAI      51.0