What is a token in
claude?
In short tokens are the
smallest unit of a language model and corresponds roughly to words but not
precisely.
Some examples
·
“hello”
à usually 1 token
“Fort
Collins, Colorado” à ≈ 4–5 tokens
<100
English words> à ≈ 130 tokens
So its close, but not
one per one on a word.
Let me run a few tests.
Testing token counting
I found this from claude https://platform.claude.com/docs/en/build-with-claude/token-counting [1] https://platform.claude.com/docs/en/api/messages/count_tokens [2]
In this doc they give an example curl command to get how many tokens an input would be. (The tool is free to use but does have some rate limits) (Of course swap it out with your Key)
If I run this
|
> curl
https://api.anthropic.com/v1/messages/count_tokens \
|
I get back
{"input_tokens":10}
|
> curl
https://api.anthropic.com/v1/messages/count_tokens \
|
Let me convert this into
a script so I can do some more quick testing.
Token Count Script
|
> sudo vi /usr/bin/token-count |
And place the following in it.
|
#!/usr/bin/env
python3 import
sys def
get_content() -> str: def
main(): if not api_key: payload = { headers = { try: response.raise_for_status() tokens =
data.get("input_tokens") if tokens is not None: except requests.RequestException as e: if
__name__ == "__main__":
|
You can download it here https://gist.github.com/patmandenver/adc2378f7b1dee821c8e94e410a5e631
Chmod it
|
> chmod 755 /usr/bin/token-count |
Now try it
|
> token-count Hi
|
Ok why does hi have 8 tokens rather than 1?
Let me try some more fun
|
> echo "word_"{001..010} | token-count |
Passing in 10 words then
20 then 100
What about just passing
hi 10, 20, then 100 times
|
> yes Hi | head -10 | tr '\n' ' ' | sed 's/ $//'
| token-count |
Interesting the overhead
seems to be 7+ a token per word.
I guess it is thinking about the numbers in the prior one word_001 seems like 4
things?
OK let me throw some code at it.
Code Testing
I found this one https://github.com/TheAlgorithms/Python [3] that has a collection of simple Python programs that are good examples.
Let’s pull a simple test repo down from github to test it on.
|
> git clone
git@github.com:TheAlgorithms/Python.git |
Let’s first look at the
lower.py code.
Let me use the wc tool to count the number of words
|
> wc -w lower.py |
That gave me back 93…
Now let me feed it into claude (no questions just feeding it in)
|
> cat lower.py | token-count |
So double the word count.
Let me try another one
|
> wc -w min_cost_string_conversion.py |
Roughly 3x in token size
vs wc
Testing some real input/outputs
Now let me test some
real, costs money, tests.
let me hit the message endpoint https://platform.claude.com/docs/en/api/messages/create [4]
|
> sudo vi /usr/bin/claude-message |
And place the following in it
|
#!/usr/bin/env
python3 """ import
sys def
get_content() -> str: parser =
argparse.ArgumentParser(description="Generate response using Anthropic
API") if args.text: print("Error: No input
provided", file=sys.stderr) def
main(): api_key = os.environ.get("ANTHROPIC_API_KEY") or os.environ.get("ANTHROPIC_API_KEY_FIX") if not api_key: content = get_content() payload = { headers = { try: response.raise_for_status() # Extract the response content # Extract token usage print(f"Tokens In:
{input_tokens}") except requests.RequestException as e: if
__name__ == "__main__": |
You can download it here https://gist.github.com/patmandenver/2674e45fc50ce8f8feca09325775c2b1
Chmod it
|
> chmod 755 /usr/bin/claude-message |
Now test it
|
> claude-message "hi" |
8 in and 24 out.
And when I check my actual usage I see it increment these
exact numbers
Conclusions
Token count is complicated.
And I imagine it will change over time.
If you run the scripts I have in here today with the same data I think you will
get the same results. However, I do not
think that is guaranteed 6 months from now… I imagine in 6 months the Token
calculation will change and if you run these same scripts then you may get
different token counts.
References
[1] Token Counting
https://platform.claude.com/docs/en/build-with-claude/token-counting
Accessed 03/2026
[2] Count tokens in a Message
https://platform.claude.com/docs/en/api/messages/count_tokens
Accessed 03/2026
[3] TheAlgorithms / Python
https://github.com/TheAlgorithms/Python
Accessed 03/2026
[4] Create a Messsage
https://platform.claude.com/docs/en/api/messages/create
Accessed 03/2026
No comments:
Post a Comment