Large Language Models (LLMs)
Compare pricing for top language models including GPT-4, Claude, Gemini, Mistral, Llama, and more.
All LLM Providers
| Vendor | Model | Input ($/1M tokens) | Output ($/1M tokens) | Context (tokens) | Free Tier |
|---|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 | $10.00 | 128k | 100k/mo |
| OpenAI | GPT-4o-mini | $0.15 | $0.60 | 128k | 100k/mo |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 | 128k | 100k/mo |
| OpenAI | o1-preview | $15.00 | $60.00 | 128k | 100k/mo |
| OpenAI | o1-mini | $3.00 | $12.00 | 128k | 100k/mo |
| Anthropic | Claude 3.5 Sonnet | $3.00 | $15.00 | 200k | Limited |
| Anthropic | Claude 3 Haiku | $0.25 | $1.25 | 200k | Limited |
| Anthropic | Claude 3 Opus | $15.00 | $75.00 | 200k | Limited |
| Google | Gemini 1.5 Pro | $1.25 | $5.00 | 2M | 1M/mo |
| Google | Gemini 1.5 Flash | $0.075 | $0.30 | 1M | 1M/mo |
| Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | 1M/mo |
| Meta | Llama 3.1 405B | $3.50 | $3.50 | 128k | Free (local) |
| Meta | Llama 3.1 70B | $0.65 | $2.75 | 128k | Free (local) |
| Meta | Llama 3.1 8B | $0.22 | $0.22 | 128k | Free (local) |
| Meta | Llama 3.2 90B Vision | $0.90 | $3.60 | 128k | Free (local) |
| Mistral | Mistral Large | $2.00 | $6.00 | 128k | 100k/mo |
| Mistral | Mistral Nemo | $0.15 | $0.15 | 128k | 100k/mo |
| Mistral | Mistral Small | $0.60 | $1.80 | 128k | 100k/mo |
| Mistral | Codestral | $0.20 | $0.70 | 32k | Free (beta) |
| Cohere | Command R+ | $3.00 | $15.00 | 128k | 10k/mo |
| Cohere | Command R | $0.50 | $1.50 | 128k | 10k/mo |
| Cohere | Command | $0.30 | $1.50 | 32k | 10k/mo |
| AWS Bedrock | Claude 3.5 Sonnet | $3.00 | $15.00 | 200k | Via AWS |
| AWS Bedrock | Llama 3.1 70B | $0.65 | $2.75 | 128k | Via AWS |
| Azure OpenAI | GPT-4o | $2.50 | $10.00 | 128k | $200 credit |
| Perplexity | Sonar Large | $3.00 | $15.00 | 128k | API pricing |
| Perplexity | Sonar Small | $0.20 | $0.70 | 128k | API pricing |
| xAI | Grok-2 | $2.00 | $10.00 | 131k | $15/mo |
| xAI | Grok-1.5 | $5.00 | $15.00 | 131k | API pricing |
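The per-token prices above turn into a workload cost by simple arithmetic: divide each token count by one million, multiply by the listed rate, and add the input and output parts. A minimal sketch, assuming the listed prices are in USD; the helper name `estimate_cost` and the workload size are illustrative and not part of any vendor API:

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o-mini": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 2_000_000, 500_000):.2f}")
```

For this workload the sketch gives $10.00 for GPT-4o, $0.60 for GPT-4o-mini, $13.50 for Claude 3.5 Sonnet, and $0.30 for Gemini 1.5 Flash. Actual bills may differ because of caching, batch discounts, or free-tier allowances, none of which are modeled here.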
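By Price
Cheapest First (input price per 1M tokens):
- Google Gemini 1.5 Flash - $0.075/1M
- Google Gemini 2.0 Flash - $0.10/1M
- Mistral Nemo - $0.15/1M
- GPT-4o-mini - $0.15/1M
- Mistral Codestral - $0.20/1M
- Perplexity Sonar Small - $0.20/1M
- Llama 3.1 8B - $0.22/1M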
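Ranking by input price alone can mislead when a workload produces a lot of output, because some models charge several times more for output than for input. One way to compare is a blended price weighted by the expected share of input tokens. The sketch below uses a few low-priced rows from the provider table; the function names and the 75% input share are assumptions chosen for illustration:

```python
# (input, output) prices in USD per 1M tokens, copied from the provider table.
LOW_COST_MODELS = {
    "Gemini 1.5 Flash": (0.075, 0.30),
    "Mistral Nemo": (0.15, 0.15),
    "GPT-4o-mini": (0.15, 0.60),
    "Llama 3.1 8B": (0.22, 0.22),
    "Command": (0.30, 1.50),
}


def blended_price(input_price: float, output_price: float,
                  input_share: float = 0.75) -> float:
    """Weighted average price per 1M tokens for a given share of input tokens."""
    return input_share * input_price + (1.0 - input_share) * output_price


def rank_by_blended_price(models: dict, input_share: float = 0.75) -> list:
    """Return model names sorted from cheapest to most expensive blended price."""
    return sorted(
        models,
        key=lambda name: blended_price(*models[name], input_share),
    )


for name in rank_by_blended_price(LOW_COST_MODELS):
    price = blended_price(*LOW_COST_MODELS[name])
    print(f"{name}: ${price:.4f} per 1M blended tokens")
```

At a 3:1 input-to-output ratio, Llama 3.1 8B ($0.22 blended) moves ahead of GPT-4o-mini ($0.2625 blended) even though its input price is higher.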
By Context Window
Longest Context:
- Google Gemini 1.5 Pro - 2M tokens
- Google Gemini 1.5 Flash / 2.0 Flash - 1M tokens
- Claude 3.5 Sonnet / Claude 3 Opus - 200k tokens
- GPT-4o/GPT-4 Turbo - 128k tokens
- Llama 3.1 models - 128k tokens
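To check which of these models can hold a given document, compare the prompt size plus some room for the reply against the context window. This is a rough sketch under two assumptions: the window is shared between input and output tokens, and `k` and `M` mean 1,000 and 1,000,000 (vendors sometimes round from powers of two). The names `parse_context`, `models_that_fit`, and `reserved_output_tokens` are hypothetical:

```python
# Context windows as listed above.
CONTEXT = {
    "Gemini 1.5 Pro": "2M",
    "Claude 3.5 Sonnet": "200k",
    "GPT-4o": "128k",
    "Llama 3.1 70B": "128k",
}

UNITS = {"k": 1_000, "M": 1_000_000}


def parse_context(size: str) -> int:
    """Convert a string such as '128k' or '2M' into a token count."""
    return int(float(size[:-1]) * UNITS[size[-1]])


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 4_000) -> list:
    """Return models whose window holds the prompt plus the reserved output."""
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, size in CONTEXT.items() if parse_context(size) >= needed]


# A 150k-token document rules out the 128k models.
print(models_that_fit(150_000))
```

Token counts depend on each vendor's tokenizer, so the same document can have a different size per model; treat the result as a first filter rather than a guarantee.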
Reasoning Models
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Notes |
|---|---|---|---|
| OpenAI o1-preview | $15.00 | $60.00 | Advanced reasoning |
| OpenAI o1-mini | $3.00 | $12.00 | Fast reasoning |
| Claude 3.5 Sonnet | $3.00 | $15.00 | Strong reasoning |
| Gemini 1.5 Pro | $1.25 | $5.00 | Good reasoning |
Open Source Models
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Local Cost |
|---|---|---|---|---|
| Llama 3.1 405B | Meta | $3.50 | $3.50 | GPU dependent |
| Llama 3.1 70B | Meta | $0.65 | $2.75 | GPU dependent |
| Llama 3.1 8B | Meta | $0.22 | $0.22 | GPU dependent |
| Mistral Large | Mistral | $2.00 | $6.00 | GPU dependent |
| Mistral Nemo | Mistral | $0.15 | $0.15 | GPU dependent |
| Code Llama 70B | Meta | $0.65 | $2.75 | GPU dependent |
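"GPU dependent" can be made concrete with a back-of-the-envelope calculation: the local cost per 1M tokens is the hourly GPU cost divided by the tokens generated per hour. The sketch below assumes the GPU runs at full utilization and ignores setup, power, and staffing costs. The hourly rate and throughput are placeholder inputs, not measurements, and the function names are hypothetical:

```python
def local_cost_per_million(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """USD per 1M generated tokens for a GPU running at full utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def cheaper_locally(api_price_per_million: float, gpu_hourly_usd: float,
                    tokens_per_second: float) -> bool:
    """True if the local cost undercuts the hosted price per 1M tokens."""
    return local_cost_per_million(gpu_hourly_usd, tokens_per_second) < api_price_per_million


# Placeholder inputs: a $2.00/hour GPU sustaining 500 tokens per second.
cost = local_cost_per_million(gpu_hourly_usd=2.00, tokens_per_second=500.0)
print(f"Local: ${cost:.2f} per 1M tokens")

# Compare against the hosted Llama 3.1 70B output price from the table ($2.75).
print(cheaper_locally(2.75, gpu_hourly_usd=2.00, tokens_per_second=500.0))
```

With these placeholder numbers the local cost is about $1.11 per 1M tokens. Real throughput varies widely with model size, batch size, and hardware, and idle time raises the effective cost, so measure before deciding.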
Best For
| Need | Recommended |
|---|---|
| Best overall | GPT-4o or Claude 3.5 Sonnet |
| Budget tasks | GPT-4o-mini or Gemini Flash |
| Long documents | Gemini 1.5 Pro |
| Complex reasoning | Claude 3.5 Sonnet or Claude 3 Opus |
| RAG applications | Command R+ or Gemini Flash |
| Open source | Llama 3.1 70B or Mistral Large |
| Code generation | Codestral or Claude 3.5 Sonnet |
| Research | Gemini 1.5 Pro or Claude 3 Opus |