MMMU-Pro
MMMU-Pro is a harder version of MMMU that eliminates shortcut strategies. It tests multimodal understanding across 30 subjects in 6 disciplines, with 30 different image types including charts, diagrams, and chemical structures.
- **Models Tested:** 4
- **Best Score:** 72.0
- **Average Score:** 63.8
- **Scale Range:** 0–100
- **Weight:** 1.2x
How It Works
Models answer questions about images that require genuine understanding, not just pattern matching. The "Pro" version removes answer choices that could be eliminated without looking at the image, making it a purer test of visual comprehension.
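Under the hood, scoring for a multiple-choice benchmark like this typically reduces to exact-match accuracy over the model's chosen options. A minimal sketch (the function and data here are hypothetical, not the official MMMU-Pro harness):

```python
def score(predictions: list[str], answers: list[str]) -> float:
    """Exact-match accuracy over multiple-choice items, on a 0-100 scale."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Toy example: 3 of 4 letter choices match the answer key.
print(score(["B", "D", "A", "C"], ["B", "D", "C", "C"]))  # 75.0
```

In practice the harness must also extract the chosen letter from free-form model output before this comparison, which is where much of the real evaluation complexity lives.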
Why It Matters
As models improve on MMMU partly through shortcuts, MMMU-Pro ensures scores reflect genuine multimodal understanding. It is the gold standard for testing whether AI truly comprehends visual academic content.
Limitations
Requires multimodal input, so text-only models cannot be tested. It is a newer benchmark with less historical data, and some image types (e.g. music sheets) may be underrepresented in training data.
Leaderboard — MMMU-Pro
| # | Model | Provider | Score |
|---|---|---|---|
| 🥇 | GPT-5.2 | OpenAI | 72.0 |
| 🥈 | Gemini 2.5 Pro Preview 06-05 | Google | 68.0 |
| 🥉 | Claude Opus 4 | Anthropic | 64.0 |
| 4 | GPT-4o | OpenAI | 51.0 |
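The summary statistics above follow directly from the four leaderboard scores; a quick check (model names and scores taken from the table, the rounding convention is an assumption):

```python
# Leaderboard scores as listed above.
scores = {
    "GPT-5.2": 72.0,
    "Gemini 2.5 Pro Preview 06-05": 68.0,
    "Claude Opus 4": 64.0,
    "GPT-4o": 51.0,
}

best = max(scores.values())
average = round(sum(scores.values()) / len(scores), 1)  # 255 / 4 = 63.75
print(best, average)  # 72.0 63.8
```

This reproduces the "Best Score" of 72.0 and the "Average Score" of 63.8 (63.75 rounded to one decimal place).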