LLM Pricing Calculator — Compare 50+ Models
Compare costs across 50+ AI models side by side. Calculate pricing for GPT, Claude, Gemini, Llama, and more. Free cost estimator.
Usage Parameters
| Model | Provider | Input $/1M | Output $/1M | Context | Cost/Call | Daily | Monthly |
|---|---|---|---|---|---|---|---|
| Llama 3.1 8B | Meta | $0.10 | $0.10 | 128K | $0.000200 | $0.0200 | $0.600 |
| Gemini 2.0 Flash | Google | $0.07 | $0.30 | 1M | $0.000375 | $0.0375 | $1.13 |
| Gemini 1.5 Flash | Google | $0.07 | $0.30 | 1M | $0.000375 | $0.0375 | $1.13 |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K | $0.000750 | $0.0750 | $2.25 |
| Command R | Cohere | $0.15 | $0.60 | 128K | $0.000750 | $0.0750 | $2.25 |
| Mistral Small | Mistral | $0.20 | $0.60 | 32K | $0.000800 | $0.0800 | $2.40 |
| Llama 3.1 70B | Meta | $0.90 | $0.90 | 128K | $0.00180 | $0.180 | $5.40 |
| GPT-3.5 Turbo | OpenAI | $0.50 | $1.50 | 16.4K | $0.00200 | $0.200 | $6.00 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | 200K | $0.00480 | $0.480 | $14.40 |
| Llama 3.1 405B | Meta | $3.00 | $3.00 | 128K | $0.00600 | $0.600 | $18.00 |
| Gemini 1.5 Pro | Google | $1.25 | $5.00 | 2M | $0.00625 | $0.625 | $18.75 |
| Mistral Large | Mistral | $2.00 | $6.00 | 128K | $0.00800 | $0.800 | $24.00 |
| Mistral Medium | Mistral | $2.70 | $8.10 | 32K | $0.0108 | $1.08 | $32.40 |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | $0.0125 | $1.25 | $37.50 |
| Command R+ | Cohere | $2.50 | $10.00 | 128K | $0.0125 | $1.25 | $37.50 |
| o1-mini | OpenAI | $3.00 | $12.00 | 128K | $0.0150 | $1.50 | $45.00 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200K | $0.0180 | $1.80 | $54.00 |
| GPT-4 Turbo | OpenAI | $10.00 | $30.00 | 128K | $0.0400 | $4.00 | $120.00 |
| o1 | OpenAI | $15.00 | $60.00 | 200K | $0.0750 | $7.50 | $225.00 |
| Claude Opus 4.6 | Anthropic | $15.00 | $75.00 | 200K | $0.0900 | $9.00 | $270.00 |
LLM Pricing Calculator — Compare AI Model Costs
AI model pricing varies significantly across providers and models. This calculator helps you compare costs for input and output tokens across 50+ models from OpenAI, Anthropic, Google, Meta, and more.
Enter your expected token usage per request and your daily or monthly volume. The calculator shows per-request and monthly cost estimates for each model, making it easy to compare options side by side. Output tokens typically cost 2-4x more than input tokens, so the balance between prompt length and response length significantly affects total cost.
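The per-request and monthly math can be sketched as follows. The prices and figures here are example inputs that mirror the table above (1,000 input tokens, 1,000 output tokens, 100 requests/day over a 30-day month), not fixed assumptions of the calculator:

```python
def per_request_cost(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one API call, given per-million-token prices in dollars."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# GPT-4o mini at $0.15 input / $0.60 output per 1M tokens:
call = per_request_cost(1000, 1000, 0.15, 0.60)   # 0.00075
daily = call * 100                                 # 0.075
monthly = daily * 30                               # 2.25
```

These figures match the GPT-4o mini row in the table, which is how the Cost/Call, Daily, and Monthly columns are derived.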
Model pricing spans several orders of magnitude. GPT-4o Mini costs roughly $0.15 per million input tokens, while GPT-4o costs $2.50 and Claude Opus costs $15 per million. For many tasks — classification, extraction, summarization — the cheaper models perform comparably, making them the better economic choice. Reserve expensive models for tasks that genuinely require their capabilities.
Batch APIs, prompt caching, and fine-tuning can further reduce costs. OpenAI's Batch API offers 50% discounts for non-time-sensitive requests. Anthropic's prompt caching reduces repeated system prompt costs by up to 90%. Fine-tuned smaller models can match larger models on specific tasks at a fraction of the cost.
Pricing data in this calculator is updated regularly but may lag behind the latest announcements. Always check the official provider pricing page before committing to large-scale usage. For measuring how many tokens your specific text uses, try our AI Token Counter.
How the LLM Pricing Calculator Works
- Select an AI model from the list (GPT-4o, Claude Opus, Gemini, etc.)
- Enter expected input and output token counts
- Set your estimated daily or monthly request volume
- See the total cost breakdown per request and per month
Optimizing LLM API Costs
LLM API pricing typically charges separately for input and output tokens, with output tokens costing 2-4x more. To reduce costs: use shorter system prompts, cache common prefixes where the API supports it, and choose the smallest model that meets your quality requirements. GPT-4o Mini and Claude Haiku are 10-20x cheaper than their flagship counterparts and sufficient for many tasks like classification, extraction, and simple generation.
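As a rough sketch of the caching math, assuming a flat 90% discount on cached system-prompt tokens (the "up to 90%" figure above; real cache-read rates, minimum cacheable lengths, and write surcharges vary by provider):

```python
def input_cost_with_cache(system_tokens: int, user_tokens: int,
                          input_price_per_m: float,
                          cache_discount: float = 0.90) -> float:
    """Per-call input cost when the system prompt is served from cache.

    Assumes cached tokens bill at (1 - cache_discount) of the base input
    price; actual cache pricing differs by provider.
    """
    cached = system_tokens * input_price_per_m * (1 - cache_discount)
    fresh = user_tokens * input_price_per_m
    return (cached + fresh) / 1_000_000

# 4,000-token system prompt + 500-token user message at $3.00/M input:
without_cache = (4000 + 500) * 3.00 / 1_000_000        # 0.0135
with_cache = input_cost_with_cache(4000, 500, 3.00)    # 0.0027
```

The longer the shared system prompt relative to the per-request text, the closer the savings get to the headline discount.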
When to Use the LLM Pricing Calculator
Use this calculator when choosing between AI models for a project, estimating monthly API costs for budgeting, comparing the cost-effectiveness of different models for specific tasks, evaluating whether to use a cheaper model for high-volume tasks, or presenting cost projections to stakeholders who need to approve AI spending.
Common Use Cases
- Compare per-request costs across different AI models and providers
- Estimate monthly API spending for a specific use case and volume
- Evaluate cost-effectiveness of smaller vs. larger models for your task
- Create budget projections for AI features in a product or service
- Determine break-even points for fine-tuning vs. using a larger general model
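To illustrate the last point, a hypothetical break-even sketch. The $500 fine-tuning cost and both prices are made-up inputs, and the model ignores any inference-price premium that providers may charge for fine-tuned models:

```python
def break_even_million_tokens(fine_tune_cost: float,
                              large_price_per_m: float,
                              small_price_per_m: float) -> float:
    """Millions of tokens at which a one-off fine-tune of a small model
    pays for itself versus calling a larger general model."""
    savings_per_m = large_price_per_m - small_price_per_m
    return fine_tune_cost / savings_per_m

# Hypothetical: $500 fine-tune, $2.50/M large model vs. $0.30/M small model.
tokens_m = break_even_million_tokens(500.0, 2.50, 0.30)  # ~227.3M tokens
```

Below that volume the larger model is cheaper overall; above it, the fine-tune wins on pure token economics.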
Expert Tips
- Start with the cheapest model that could work for your task. Only upgrade to a more expensive model if quality testing shows the cheaper option is insufficient.
- Factor in the input/output ratio for your specific use case. A classification task (long input, short output) has different economics than a content generation task (short input, long output).
- Prompt caching can reduce costs by up to 90% for applications that use the same system prompt across requests — check if your provider supports it.
- For high-volume applications, the price difference between models compounds dramatically. A $0.15/M model vs. a $15/M model means the difference between $150/month and $15,000/month at 1 billion tokens.
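Running the numbers from the last tip (1 billion tokens per month, ignoring the input/output price split for simplicity):

```python
monthly_tokens = 1_000_000_000

cheap_model = monthly_tokens / 1_000_000 * 0.15    # $150.00/month
flagship = monthly_tokens / 1_000_000 * 15.00      # $15,000.00/month
ratio = flagship / cheap_model                     # 100x
```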
Frequently Asked Questions
Why are output tokens more expensive than input tokens?
Output tokens require more computation than input tokens. During generation, the model must run a forward pass for every single output token, while input tokens can be processed in parallel. This computational asymmetry is reflected in the pricing: output tokens typically cost 2-4x more than input tokens.
What are the cheapest AI models available?
As of 2025-2026, GPT-4o Mini ($0.15/M input), Claude Haiku ($0.25/M input), and Gemini Flash ($0.075/M input) are among the most affordable. These models handle many tasks (classification, extraction, simple generation) at quality comparable to larger models, at a fraction of the cost.
How can I reduce my AI API costs?
Use the smallest model that meets your quality requirements. Enable prompt caching where available (saves up to 90% on repeated system prompts). Use batch APIs for non-time-sensitive requests (50% discount on OpenAI). Optimize prompts to be concise: shorter prompts mean fewer input tokens.
Are the prices always current?
Pricing data is updated regularly but may lag behind the latest provider announcements. AI model pricing changes frequently: new model releases, promotional pricing, and tier adjustments happen multiple times per year. Always verify against the official provider pricing page before committing to large-scale usage.
Related Tools
AI Token Counter — GPT, Claude & Gemini
Count tokens for GPT, Claude, Gemini, and other AI models. Estimate costs per API call with built-in pricing. Free online tool.
AI Model Comparison — 50+ Models Side by Side
Compare 50+ AI models: pricing, context windows, capabilities, and benchmarks. Filter by provider, open source, and features.
AI Text Analyzer — Pattern & Style Metrics
Analyze text patterns: sentence variation, vocabulary diversity, repetition, and burstiness scores. Free writing analysis tool.
AI Content Detector — Free Text Analysis
Analyze text for AI-generated patterns using perplexity, burstiness, and vocabulary diversity. Free, private — runs entirely in your browser.
AI Prompt Generator — Structured Builder
Build structured prompts for ChatGPT, Claude, and other AI models. Select role, task, context, and format. Free prompt engineering tool.
AI Image Prompt Builder — Midjourney & More
Build prompts for Midjourney, DALL-E, Stable Diffusion, and Flux. Style, lighting, and composition controls. Free prompt tool.
Learn More
AI Tools Every Developer Should Know in 2026: Tokens, Prompts, and Model Selection
A practical guide to AI development tools: understanding tokens, writing effective prompts, comparing models, and optimizing costs for LLM-powered applications.
LLM Development Tools: Compare Models, Calculate Costs, Count Tokens, and Build System Prompts
Essential tools for AI developers: compare LLM models side by side, calculate API costs, count tokens accurately, format fine-tuning data, and build effective system prompts.