
Tokens — What They Mean

TL;DR: Tokens are the units LLMs use to process text — roughly 3/4 of a word in English. They determine both what the model can handle in one request and how much you pay.

Simple Explanation

LLMs don’t read words like humans do. They break text into smaller pieces called tokens:

  • “Hello” = 1 token
  • “Artificial intelligence” = 2 tokens
  • “Antidisestablishmentarianism” = 6 tokens

Rule of thumb for English: 1 token ≈ 4 characters ≈ 0.75 words

So 1,000 tokens ≈ 750 words ≈ about 1.5 pages of text.
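A quick way to apply the rule of thumb in code (a rough sketch only; the 4-characters-per-token ratio is an approximation for English, and exact counts require the model's actual tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))


# Roughly 750 English words
sample = "word " * 750              # placeholder text, ~3,750 characters
print(estimate_tokens(sample))      # ~938, in the same ballpark as the ~1,000 tokens the rule predicts
```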

Why It Matters for Business

Tokens affect two critical things:

1. Context Window (What Fits)

Every model has a maximum token limit for input + output combined:

Model           | Context Window | Roughly
----------------|----------------|-------------
GPT-4o          | 128K tokens    | ~96K words
Claude Opus 4.5 | 200K tokens    | ~150K words
Gemini 1.5 Pro  | 1M tokens      | ~750K words

If your prompt + expected response exceeds this, the request fails or gets truncated.
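A minimal sketch of that check (hypothetical helper; real token counts come from the provider's tokenizer or API):

```python
def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int) -> bool:
    """Check whether the input plus the reserved output budget fits the window."""
    return prompt_tokens + max_output_tokens <= context_window


# Against a 128K-token window (GPT-4o class)
print(fits_in_context(120_000, 4_000, 128_000))  # True: fits
print(fits_in_context(126_000, 4_000, 128_000))  # False: would fail or be truncated
```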

2. Pricing (What You Pay)

API pricing is per token (usually per million):

Model         | Input Cost        | Output Cost
--------------|-------------------|------------------
GPT-4o        | $2.50 / 1M tokens | $10 / 1M tokens
Claude Sonnet | $3 / 1M tokens    | $15 / 1M tokens

Output tokens cost more because generating text one token at a time is more computationally expensive than processing input.

Real-World Example

A customer support bot processes a conversation:

  • Customer message: 50 tokens
  • Chat history: 500 tokens
  • System prompt: 200 tokens
  • Total input: 750 tokens

The bot generates a 100-token response.

Cost calculation (at $3/$15 per million):

  • Input: 750 × $0.000003 = $0.00225
  • Output: 100 × $0.000015 = $0.0015
  • Total: $0.00375 per conversation

At 10,000 conversations per month, that comes to $37.50.
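The same arithmetic as a small helper (a sketch; the $3/$15 per million rates are the illustrative figures used above and change over time):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request at per-million-token prices."""
    return (input_tokens * input_price_per_m +
            output_tokens * output_price_per_m) / 1_000_000


per_conversation = request_cost(750, 100, 3.00, 15.00)
print(f"${per_conversation:.5f} per conversation")     # $0.00375
print(f"${per_conversation * 10_000:.2f} per month")   # $37.50 at 10,000 conversations
```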

Token Optimization Tips

  1. Shorter system prompts — Every token counts, especially at scale
  2. Summarize history — Don’t send the full conversation; compress older context (see the sketch below this list)
  3. Choose the right model — Use cheaper models for simple tasks
  4. Cache when possible — Some providers offer prompt caching discounts
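As an illustration of tip 2, here is a minimal sketch that keeps only the most recent messages within a token budget; `count_tokens` is a placeholder for whatever counter you use, such as the rough estimator above or a real tokenizer:

```python
def trim_history(messages: list[str], budget: int, count_tokens) -> list[str]:
    """Keep the most recent messages whose combined token count fits `budget`."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order


# Keep roughly the last 500 tokens of history, using the 4-chars-per-token estimate
history = ["(older messages)...", "How do I reset my password?", "Here are the steps..."]
trimmed = trim_history(history, budget=500, count_tokens=lambda s: max(1, len(s) // 4))
```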

Common Misconceptions

  • Myth: Tokens are the same as words

  • Reality: Common words = 1 token; complex/rare words = multiple tokens

  • Myth: All languages tokenize equally

  • Reality: Non-English languages often use 2-3x more tokens for the same content

Tokenizer Differences

Different models tokenize differently:

  • OpenAI uses “tiktoken”
  • Anthropic uses their own tokenizer
  • The same text may yield slightly different token counts across models

Use official tokenizer tools to estimate costs accurately.
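For OpenAI models, the tiktoken library gives exact counts. A minimal sketch, assuming the o200k_base encoding used by GPT-4o-family models; Anthropic models would need Anthropic's own tokenizer or token-counting API instead:

```python
# pip install tiktoken
import tiktoken

# o200k_base is used by GPT-4o-family models; older GPT-4 / GPT-3.5 use cl100k_base
enc = tiktoken.get_encoding("o200k_base")

for text in ["Hello", "Artificial intelligence", "Antidisestablishmentarianism"]:
    print(f"{text!r}: {len(enc.encode(text))} tokens")
```

Counts from a real tokenizer may differ slightly from the approximations earlier in this article, which is exactly why official tools are worth using before committing to a cost estimate.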

Key Takeaways

  • Tokens ≈ 0.75 words in English
  • They limit context window AND determine cost
  • Output tokens cost more than input tokens
  • Non-English text uses more tokens
  • Optimize tokens at scale for significant savings

Sources

  • OpenAI Tokenizer documentation
  • Anthropic Claude pricing guides