FinanceBench
Domain: Open-ended financial analysis, 150 questions over 10-K and 10-Q filings
Models Tested: 8
Best Score: 82.0
Average Score: 76.0
Scale Range: 0–100
Weight: 0.8x
How It Works
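Each model answers 150 open-ended financial analysis questions over 10-K and 10-Q filings, following the benchmark's standardised protocol. Scores are reported on a 0–100 scale.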
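As a rough illustration of how a 0–100 score could come out of 150 questions, the sketch below assumes each answer is graded simply correct or incorrect and that the score is the percentage correct. That grading rule and the `percent_correct` helper are assumptions made for illustration. The source does not describe the actual scoring protocol.

```python
# Hypothetical scoring sketch. Assumes binary correct/incorrect grading and a
# percentage-correct score; the benchmark's real protocol may differ.

def percent_correct(grades: list[bool]) -> float:
    """Return the share of correct answers on a 0-100 scale, to one decimal."""
    if not grades:
        raise ValueError("no graded answers")
    return round(100.0 * sum(grades) / len(grades), 1)


# Illustrative input only: 123 correct answers out of 150 questions.
grades = [True] * 123 + [False] * 27
score = percent_correct(grades)
print(score)
```

Under this assumption, one question is worth about 0.67 points, so small score differences between models may reflect only one or two answers.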
Why It Matters
FinanceBench offers a standardised way to compare how well AI models handle open-ended financial analysis questions grounded in 10-K and 10-Q filings.
Limitations
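The benchmark covers only 150 questions drawn from 10-K and 10-Q filings, so it reflects a narrow slice of financial analysis. Like any benchmark, it should be considered alongside other evaluations.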