A fast AI token counter and cost estimator for the major LLM APIs — GPT-4o, GPT-4 Turbo, GPT-3.5, Claude Opus 4, Claude Sonnet 4, and Claude Haiku 4. Paste or type into the textarea and see live estimates of token count, character count, and word count, plus a per-million-token cost breakdown for input, output, and total spend. Pick a model, optionally specify expected output tokens, and get an instant budget for any prompt. Ideal for sizing up long prompts before you send them and for estimating batch-job costs without having to run them first.
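The estimator described above can be sketched in a few lines of Python. The 4-characters-per-token heuristic and the per-million-token cost formula come from the tool's own description; the price table below is illustrative only, since real prices vary by model and change over time:

```python
import math

# Illustrative per-million-token prices in USD. These are assumptions for the
# sketch, not authoritative figures -- check each provider's pricing page.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
}

def estimate_tokens(text: str) -> int:
    """Rough token count using the 4-characters-per-token heuristic."""
    return math.ceil(len(text) / 4)

def estimate_cost(text: str, model: str, expected_output_tokens: int = 0) -> dict:
    """Estimate input, output, and total cost for a prompt against one model."""
    input_tokens = estimate_tokens(text)
    p = PRICES[model]
    input_cost = input_tokens / 1_000_000 * p["input"]
    output_cost = expected_output_tokens / 1_000_000 * p["output"]
    return {
        "input_tokens": input_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }
```

For example, a 4,000-character prompt estimates to about 1,000 input tokens, and the expected-output field adds the output-side cost on top.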
Last updated: March 2026

Is this an exact token count?
No. Exact counts require OpenAI's tiktoken library or Anthropic's count_tokens endpoint. This tool uses the industry-standard 4-characters-per-token heuristic for English text.

How accurate is the estimate?
GPT-4o uses o200k_base, GPT-3.5 and GPT-4 Turbo use cl100k_base, and Claude uses Anthropic's proprietary BPE tokenizer. Expect accuracy within 10–15% for typical English text. Code, JSON, non-English languages, unusual punctuation, and long unique identifiers can all produce counts that differ by 30% or more from the estimate.

How do I get exact token counts?
For OpenAI models, install the tiktoken Python library (pip install tiktoken) and call encoding_for_model("gpt-4o") to get the exact tokenizer. For Anthropic, use the count_tokens endpoint on the Messages API, which returns exact token counts without consuming quota. For JavaScript/Node, gpt-tokenizer and @anthropic-ai/tokenizer provide client-side equivalents. Run these in your prompt pipeline before calling the API if precise counting matters for your budget.

Do all models tokenize text the same way?
No. GPT-3.5 and GPT-4 Turbo use cl100k_base (100,277 tokens), GPT-4o uses the newer o200k_base (200,019 tokens), which is roughly 20% more efficient on many non-English languages, and Claude uses its own BPE variant. The same sentence can produce different token counts across providers, which affects both cost (billed per token) and context window usage (tokens consumed against the model's maximum context).

How can I reduce API costs?
Output tokens are typically billed at a higher rate than input tokens, so setting max_tokens, requesting terse responses, and caching long system prompts all have an outsized effect on cost.
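The exact-counting workflow for OpenAI models can be sketched as follows. tiktoken's encoding_for_model is a real API, but it is a third-party dependency (pip install tiktoken), so this sketch falls back to the 4-characters-per-token heuristic when the library is unavailable or does not know the model name:

```python
import math

def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Exact token count via tiktoken when available; heuristic fallback otherwise."""
    try:
        import tiktoken  # third-party: pip install tiktoken
        return len(tiktoken.encoding_for_model(model).encode(text))
    except (ImportError, KeyError):
        # Fallback: the 4-characters-per-token heuristic used by this tool.
        return math.ceil(len(text) / 4)
```

Calling this in your pipeline before sending a request lets you budget precisely when tiktoken is installed, and still gives a usable estimate when it is not.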