Paste your prompt, estimate tokens, and compare API costs across 26 models from OpenAI, Anthropic, Google, DeepSeek, and Meta.
Pricing verified as of March 2026 from official provider pages.
By default, output tokens are set to match the input estimate.
Adjusts the thinking-token multiplier for reasoning models (o3, o4-mini, DeepSeek Reasoner).
| Model | Provider | Context | Total Cost |
|---|---|---|---|
| GPT-5.4 1M (experimental) | OpenAI | 272K | $0.00 |
| GPT-5.4 Mini | OpenAI | 128K | $0.00 |
| GPT-5 | OpenAI | 128K | $0.00 |
| GPT-5 Mini | OpenAI | 128K | $0.00 |
| GPT-4.1 | OpenAI | 1M | $0.00 |
| GPT-4.1 Mini | OpenAI | 1M | $0.00 |
| GPT-4.1 Nano | OpenAI | 1M | $0.00 |
| GPT-4o | OpenAI | 128K | $0.00 |
| GPT-4o Mini | OpenAI | 128K | $0.00 |
| o4-mini (reasoning) | OpenAI | 200K | $0.00 |
| o3 (reasoning) | OpenAI | 200K | $0.00 |
| o3-mini (reasoning) | OpenAI | 200K | $0.00 |
| Claude Opus 4.6 | Anthropic | 200K | $0.00 |
| Claude Sonnet 4.6 | Anthropic | 200K | $0.00 |
| Claude Haiku 4.5 | Anthropic | 200K | $0.00 |
| Gemini 3.1 Pro (preview) | Google | 1M | $0.00 |
| Gemini 3 Flash | Google | 1M | $0.00 |
| Gemini 3.1 Flash-Lite (preview) | Google | 1M | $0.00 |
| Gemini 2.5 Pro (thinking) | Google | 1M | $0.00 |
| Gemini 2.5 Flash (thinking) | Google | 1M | $0.00 |
| Gemini 2.5 Flash-Lite | Google | 1M | $0.00 |
| DeepSeek V4 (flagship) | DeepSeek | 164K | $0.00 |
| DeepSeek V3.2 | DeepSeek | 164K | $0.00 |
| DeepSeek Reasoner (reasoning) | DeepSeek | 164K | $0.00 |
| Llama 4 Maverick (open source) | Meta (Groq) | 1M | $0.00 |
| Llama 4 Scout (open source, 10M ctx) | Meta (Groq) | 10M | $0.00 |
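The total-cost column is computed from per-million-token prices. As a minimal sketch (the prices below are purely illustrative, not quoted from any provider; always check the official pricing pages):

```python
def api_cost(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD, given prices quoted per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical model priced at $2.50/M input and $10.00/M output:
cost = api_cost(10_000, 2_000, 2.50, 10.00)
print(f"${cost:.4f}")  # -> $0.0450
```

With an empty prompt, both token counts are zero, which is why every row above shows $0.00.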
Tokens are not words. LLMs process text as tokens, which are subword units that may be full words, word fragments, or individual characters.
On average, 1 token ≈ 4 characters or ≈ 0.75 words in English. Code uses ~3.5 chars/token. CJK/Arabic text uses ~1.5 tokens per character.
Tokenizers differ: OpenAI uses tiktoken (BPE), while Anthropic and Google use their own proprietary tokenizers, so token counts can vary by ~10-15% between models for the same text.
1. Be concise: remove filler words and redundancy from prompts.
2. Use system prompts wisely: they count as input tokens on every API call.
3. Set max_tokens: cap output length in API calls to control costs.
4. Use cheaper models for simple tasks (GPT-4o Mini, Haiku, Flash).
5. Batch API: most providers offer 50% savings for async processing.
6. Prompt caching: reuse cached prompts for 50-90% input savings.
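Tips 5 and 6 compound. A quick sketch of the arithmetic, using the discount figures above with hypothetical costs and cache rates (actual discounts and cache mechanics vary by provider):

```python
base_input_cost = 1.00   # USD spent on input tokens per day (hypothetical)
batch_discount = 0.50    # 50% off for async batch processing
cache_discount = 0.90    # best case: 90% off cached input tokens

# Suppose 80% of each prompt is a cached system prompt reused on every call:
cached_fraction = 0.8
effective = base_input_cost * ((1 - cached_fraction)
                               + cached_fraction * (1 - cache_discount))
effective *= (1 - batch_discount)  # then route through the batch API
print(effective)  # -> 0.14, i.e. 86% cheaper than the naive spend
```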
Pricing last updated: March 2026. Prices may change; always verify with official provider pricing pages before production use.
LLM providers use tokenizers (like tiktoken for OpenAI) that split text into subword units called tokens. On average, 1 token is about 4 characters or 0.75 words for English text. Code and non-Latin scripts (CJK, Arabic, Hebrew) use more tokens per character. This tool uses content-aware heuristics to estimate token counts.
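A content-aware heuristic along these lines can be sketched as follows (the exact ratios and script detection this tool uses are assumptions; this is not a real tokenizer, just the rough rules of thumb stated above):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~1.5 tokens per CJK character,
    ~4 characters per token for everything else."""
    cjk = sum(1 for ch in text if '\u4e00' <= ch <= '\u9fff')
    other = len(text) - cjk
    return round(cjk * 1.5 + other / 4)

print(estimate_tokens("Hello, world! This is a test prompt."))  # -> 9
```

For billing-accurate counts, use the provider's own tokenizer (e.g. the tiktoken library for OpenAI models).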
Reasoning models like o3, o4-mini, and DeepSeek Reasoner generate internal chain-of-thought tokens before producing the final answer. These thinking tokens are billed as output tokens, significantly increasing the effective cost. The Reasoning Effort toggle (Low 1.5x, Medium 3x, High 5x) lets you estimate this overhead.
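The multiplier math behind that toggle is straightforward. A sketch, using the multipliers above and an assumed (not official) output price:

```python
EFFORT_MULTIPLIER = {"low": 1.5, "medium": 3.0, "high": 5.0}

def reasoning_output_cost(visible_output_tokens: int,
                          output_price_per_m: float,
                          effort: str = "medium") -> float:
    """Thinking tokens are billed as output, so scale the visible
    output by the effort multiplier before pricing."""
    billed_tokens = visible_output_tokens * EFFORT_MULTIPLIER[effort]
    return billed_tokens * output_price_per_m / 1_000_000

# 1,000 visible tokens at a hypothetical $8/M output price:
print(reasoning_output_cost(1_000, 8.00, "high"))  # 5,000 billed tokens -> 0.04
```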
A context window is the maximum number of tokens an LLM can process in a single request, including both input and output tokens. For example, GPT-4o has a 128K token context window, while Claude Sonnet 4.6 supports 200K tokens. Exceeding the context window causes the request to fail.
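Since input and output share the window, a pre-flight check is just a sum, as this sketch shows:

```python
def fits_context(input_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """Input and output tokens must fit in the same context window."""
    return input_tokens + max_output_tokens <= context_window

# GPT-4o's 128K window, per the table above:
print(fits_context(120_000, 4_000, 128_000))   # True
print(fits_context(120_000, 16_000, 128_000))  # False: request would fail
```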
To reduce costs: (1) Use smaller, cheaper models for simple tasks, (2) Minimize prompt length by removing unnecessary context, (3) Set lower max output tokens, (4) Cache frequent responses, (5) Use batch APIs for non-real-time workloads, (6) Consider open-source models like Llama for high-volume use cases.