Tokenizer

Count and visualize the tokens used by different LLMs. Enter your text below to see how it is tokenized.

The interface has three parts: a text input area (enter the text you want to tokenize), a model selector, and a tokenized-text view in which each color represents a different token. Running token and character counts for the current input appear above the highlighted output.
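
The page doesn't show how the highlighting is produced, but the behavior it describes, splitting text into model-specific tokens and counting them, can be reproduced with a tokenizer library. The sketch below assumes OpenAI's tiktoken package; the model name and sample text are illustrative assumptions, not details taken from this tool.

import tiktoken

# Minimal sketch of model-specific tokenization, assuming the tiktoken
# library; the model name and sample text are illustrative assumptions.
def tokenize(text: str, model: str = "gpt-4"):
    enc = tiktoken.encoding_for_model(model)   # encoding this model uses
    token_ids = enc.encode(text)               # text -> list of integer token ids
    # Decode each id on its own to see the span of text each token covers,
    # i.e. the pieces the tool would highlight in different colors.
    pieces = [enc.decode([tid]) for tid in token_ids]
    return token_ids, pieces

ids, pieces = tokenize("Count and visualize tokens used by different LLMs.")
print(f"Tokens: {len(ids)}  Characters: {len(''.join(pieces))}")
print(pieces)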

Note: a helpful rule of thumb is that one token generally corresponds to ~4 characters of common English text, or roughly ¾ of a word (so 100 tokens ≈ 75 words).
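
As a rough sanity check on that rule of thumb, you can measure the characters-per-token ratio of a sample directly. The snippet below again assumes tiktoken and picks the cl100k_base encoding; both are assumptions for illustration rather than details of this tool.

import tiktoken

# Rough check of the ~4-characters-per-token rule of thumb; the encoding
# choice (cl100k_base) is an assumption, not something the tool specifies.
enc = tiktoken.get_encoding("cl100k_base")
sample = ("A helpful rule of thumb is that one token generally corresponds "
          "to about four characters of common English text.")
n_tokens = len(enc.encode(sample))
print(f"{len(sample)} characters / {n_tokens} tokens "
      f"= {len(sample) / n_tokens:.2f} characters per token")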