Analysis · 15 Feb 2026 · 6 min read
The AI Pricing Race to Zero
In early 2024, GPT-4 cost $30 per million input tokens. Today, models of comparable quality cost under $1 per million input tokens — a price drop of roughly 97% in two years. Here's what happened and where pricing is heading.
The Price Collapse Timeline
The great AI price war started in earnest in mid-2024 when several things happened simultaneously:
- GPT-4o launched at $5 input / $15 output per million tokens — less than half the price of GPT-4 Turbo, with better performance.
- DeepSeek V2 showed that efficiency wins — by using a Mixture of Experts architecture, it achieved GPT-4-class performance at roughly a tenth of the cost.
- Google made Gemini Flash nearly free — aggressively pricing their fastest model to gain market share.
- Open-source models reached GPT-4 level — Llama 3.1 70B matched GPT-4 on many benchmarks, available for free.
Where Are We Now?
The current pricing landscape looks remarkably different from even a year ago:
| Tier | Input ($/1M tokens) | Output ($/1M tokens) | Models |
|---|---|---|---|
| Ultra Premium | $15-20 | $60-80 | Claude Opus 4, o3-pro |
| Premium | $2-10 | $8-40 | Claude Sonnet 4, GPT-4o, o3 |
| Value | $0.10-1 | $0.40-4 | GPT-4.1 Nano, Gemini Flash, DeepSeek V3 |
| Free tier | $0 | $0 | Gemini Flash (free tier), open-source self-hosted |
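The gap between tiers is easier to feel with concrete numbers. Here is a minimal sketch that prices a month of traffic against representative rates from the table above; the specific workload (50M input / 10M output tokens) is a hypothetical example, and the per-tier rates are mid-points, not quotes from any provider.

```python
# Representative per-million-token rates, taken as mid-points of the
# tiers in the table above (illustrative, not official pricing).
TIERS = {
    "ultra_premium": {"input": 15.00, "output": 60.00},  # e.g. Claude Opus 4
    "premium":       {"input": 3.00,  "output": 15.00},  # e.g. Claude Sonnet 4
    "value":         {"input": 0.10,  "output": 0.40},   # e.g. GPT-4.1 Nano
}

def monthly_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Dollar cost for a month of traffic at the given per-1M-token rates."""
    return (input_tokens / 1_000_000) * rates["input"] \
         + (output_tokens / 1_000_000) * rates["output"]

# Hypothetical workload: 50M input tokens, 10M output tokens per month.
for tier, rates in TIERS.items():
    print(f"{tier}: ${monthly_cost(50_000_000, 10_000_000, rates):,.2f}")
```

For this example workload the spread is stark: about $1,350/month at the ultra-premium tier versus about $9/month at the value tier — a 150x difference for the same token volume.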
What This Means for You
If you're building with AI or choosing a model:
- Don't overpay for tasks that cheap models handle well. Simple text generation, summarisation, and classification can often be done by $0.10/1M models.
- Use premium models only where quality matters. Complex reasoning, creative writing, and coding tasks benefit from the best models.
- Watch the open-source gap closing. Self-hosting is becoming cost-effective for high-volume use cases.
- Expect prices to keep falling. Hardware improvements (Nvidia Blackwell, custom chips from Google/Amazon) will push costs lower.
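The first two points above amount to a routing rule: classify the task, then pick the cheapest tier that handles it well. A minimal sketch of that idea — the model names and task taxonomy here are illustrative assumptions, not a real API:

```python
# Task types that cheap ($0.10/1M-class) models typically handle well,
# per the advice above. The set and model names are hypothetical.
CHEAP_TASKS = {"summarization", "classification", "extraction", "simple_generation"}

def pick_model(task_type: str) -> str:
    """Route a task to a pricing tier by its type (illustrative names)."""
    if task_type in CHEAP_TASKS:
        return "value-tier-model"    # simple text work: ~$0.10/1M input
    return "premium-tier-model"      # complex reasoning, coding, creative work

print(pick_model("classification"))  # → value-tier-model
print(pick_model("coding"))          # → premium-tier-model
```

In practice the classification step itself can run on a cheap model, so the routing overhead stays negligible relative to the savings.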
Track AI Pricing in Real Time
Our comparison tables are updated automatically. See current pricing across all providers.
View LLM pricing comparison →