
Token

The smallest unit of text an AI processes—usually a word fragment, character, or subword.

A token is how language models break down text into digestible pieces before processing. A word like "running" might be tokenized as "run" + "ning", while a short common word like "cat" stays whole. Token boundaries vary by model, but roughly 1 token ≈ 4 English characters, or about 0.75 words. Understanding tokens matters because most AI pricing and usage limits are measured in tokens (e.g., "100k tokens per day"), and a model's context limit caps how much text you can process at once.
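The 4-characters-per-token rule of thumb can be turned into a quick budgeting helper. This is a sketch, not any library's API; the function name and heuristic are illustrative, and real counts depend on the model's actual tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 English characters per token heuristic.

    Real counts vary by tokenizer; use the model's own tokenizer for
    anything billing- or limit-sensitive.
    """
    return max(1, round(len(text) / 4))

# "Hello, world!" is 13 characters, so roughly 3 tokens by this heuristic.
print(estimate_tokens("Hello, world!"))
```

This is only useful for ballpark planning, such as checking whether a document will fit in a context window before sending it.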

When an AI processes text, it converts every piece into tokens, assigns each a number, runs them through the model, and converts the output tokens back to human text. The more tokens in your input, the slower and more expensive the response.
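The round trip described above (text → tokens → numeric ids → model → output ids → text) can be sketched with a toy tokenizer. The hand-written vocabulary and greedy longest-match strategy here are purely for demonstration; production tokenizers (e.g., BPE) learn their vocabularies from data.

```python
# Toy vocabulary mapping text pieces to integer ids (illustrative only).
VOCAB = {"run": 0, "ning": 1, "cat": 2, "s": 3, " ": 4}
ID_TO_TOKEN = {i: t for t, i in VOCAB.items()}

def encode(text: str) -> list[int]:
    """Greedy longest-match tokenization: split text into known pieces,
    then map each piece to its numeric id."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest possible piece starting at position i.
        for size in range(len(text) - i, 0, -1):
            piece = text[i:i + size]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i += size
                break
        else:
            raise ValueError(f"no token for text at position {i}")
    return ids

def decode(ids: list[int]) -> str:
    """Map ids back to their token strings and join them into text."""
    return "".join(ID_TO_TOKEN[i] for i in ids)

ids = encode("running cats")   # splits into "run" + "ning" + " " + "cat" + "s"
print(ids)
print(decode(ids))             # round-trips back to "running cats"
```

The model itself only ever sees the integer ids; the encode and decode steps are what bridge human text and the model's numeric world.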

Example

The sentence "Hello, world!" might tokenize as ["Hello", ",", "world", "!"]—4 tokens.