LiveCodeBench

coding

LiveCodeBench evaluates coding ability on competitive programming problems sourced from live contests (LeetCode, Codeforces, AtCoder) that post-date model training cutoffs.


Models Tested: 0

Average Score: 0.0

Scale Range: 0–100

Weight: 1.2x

How It Works

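Models solve algorithmic programming problems, and each submission is verified by running it against the problem's test cases; a solution counts only if it passes all of them. The problem set is continuously updated from recent programming contests, so a model can be evaluated on problems released after its training cutoff, which it is unlikely to have seen before.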
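The verification step can be sketched as follows. This is a minimal illustration, not LiveCodeBench's actual harness: the function names `passes_all_tests` and `score`, the stdin/stdout test format, the whitespace-insensitive output comparison, and the 5-second timeout are all assumptions made for the example.

```python
import subprocess
import sys


def passes_all_tests(solution_code, test_cases, timeout_s=5.0):
    """Return True only if the program produces the expected output on every test.

    test_cases is a list of (stdin_text, expected_stdout) pairs (assumed format).
    """
    for stdin_text, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", solution_code],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False  # too slow counts as a failure
        if result.returncode != 0:
            return False  # a crash counts as a failure
        # Assumption: compare whitespace-separated tokens, ignoring layout.
        if result.stdout.split() != expected.split():
            return False
    return True


def score(results):
    """Percentage of problems solved, on the 0-100 scale used by the page."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)
```

Passing a candidate program as a string together with a list of input/output pairs yields one boolean per problem; `score` then turns the list of booleans into a percentage. A real harness would also sandbox the submitted code, which this sketch does not attempt.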

Why It Matters

By drawing problems from recent contests, LiveCodeBench reduces data contamination, a major issue with older coding benchmarks whose problems may already appear in training data. This gives a more reliable assessment of a model's algorithmic reasoning ability.
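The contamination argument comes down to a date filter: only problems released after a model's training cutoff are scored for that model. The sketch below illustrates the idea under stated assumptions; the `Problem` record, its field names, and `problems_after_cutoff` are hypothetical and are not taken from LiveCodeBench's code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    problem_id: str      # hypothetical identifier
    source: str          # e.g. "LeetCode", "Codeforces", "AtCoder"
    release_date: date   # date the contest problem was published


def problems_after_cutoff(problems, training_cutoff):
    """Keep only problems released strictly after the model's training cutoff."""
    return [p for p in problems if p.release_date > training_cutoff]
```

With a cutoff of 1 January 2024, for example, a problem released in May 2023 would be dropped and problems from February and June 2024 would be kept. Because each model has its own cutoff, two models may end up scored on different problem subsets unless the later of the two cutoffs is used for both.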

Limitations

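Competitive programming is a specific skill that does not fully represent general software engineering ability. Because the problem set is continuously updated, scores measured on different problem windows are not directly comparable, which complicates historical comparisons.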

Leaderboard — LiveCodeBench

No model scores recorded yet for this benchmark.