Tokenizer

Count and visualize the tokens used by different LLMs. Enter your text below to see how it is tokenized.

The interface has three parts: a text input area (enter the text you want to tokenize), a model selector, and a tokenized-text view in which each color represents a different token. Running token and character counts for the current input appear above the highlighted output.
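
The page doesn't show how the highlighting is produced, but the behavior it describes, splitting text into model-specific tokens and counting them, can be reproduced with a tokenizer library. The sketch below assumes OpenAI's tiktoken package; the model name and sample text are illustrative assumptions, not details taken from this tool.

import tiktoken

# Minimal sketch of model-specific tokenization, assuming the tiktoken
# library; the model name and sample text are illustrative assumptions.
def tokenize(text: str, model: str = "gpt-4"):
    enc = tiktoken.encoding_for_model(model)   # encoding this model uses
    token_ids = enc.encode(text)               # text -> list of integer token ids
    # Decode each id on its own to see the span of text each token covers,
    # i.e. the pieces the tool would highlight in different colors.
    pieces = [enc.decode([tid]) for tid in token_ids]
    return token_ids, pieces

ids, pieces = tokenize("Count and visualize tokens used by different LLMs.")
print(f"Tokens: {len(ids)}  Characters: {len(''.join(pieces))}")
print(pieces)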

Note: a helpful rule of thumb is that one token generally corresponds to ~4 characters of common English text, or roughly ¾ of a word (so 100 tokens ≈ 75 words).
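
As a rough sanity check on that rule of thumb, you can measure the characters-per-token ratio of a sample directly. The snippet below again assumes tiktoken and picks the cl100k_base encoding; both are assumptions for illustration rather than details of this tool.

import tiktoken

# Rough check of the ~4-characters-per-token rule of thumb; the encoding
# choice (cl100k_base) is an assumption, not something the tool specifies.
enc = tiktoken.get_encoding("cl100k_base")
sample = ("A helpful rule of thumb is that one token generally corresponds "
          "to about four characters of common English text.")
n_tokens = len(enc.encode(sample))
print(f"{len(sample)} characters / {n_tokens} tokens "
      f"= {len(sample) / n_tokens:.2f} characters per token")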