Measuring Claude 4.7's tokenizer costs
aray07
574 points
401 comments
April 17, 2026
Related Discussions
Found 5 related stories in 62.4ms across 4,861 title embeddings via pgvector HNSW
- Universal Claude.md – cut Claude output tokens killme2008 · 231 pts · March 31, 2026 · 65% similar
- No, it doesn't cost Anthropic $5k per Claude Code user jnord · 80 pts · March 09, 2026 · 56% similar
- Claude usage limits hitting faster than expected Austin_Conlon · 11 pts · March 31, 2026 · 54% similar
- Claudetop – htop for Claude Code sessions (see your AI spend in real-time) liorwn · 51 pts · March 14, 2026 · 53% similar
- Reallocating $100/Month Claude Code Spend to Zed and OpenRouter kisamoto · 319 pts · April 09, 2026 · 52% similar
Discussion Highlights (20 comments)
uberman
On actual code, I see what you see: a ~30% increase in tokens, which is in line with what they claim as well. I personally don't tend to feed technical documentation or random prose into LLMs. Given that Opus 4.6 and even Sonnet 4.6 are still valid options, for me the question is not "Does 4.7 cost more than claimed?" but "What capabilities does 4.7 give me that 4.6 did not?" Yesterday 4.6 was a great option, and it is too soon for me to tell whether 4.7 is a meaningful lift. If it is, then I can evaluate whether the increased cost is justified.
dallen33
I'm still using Sonnet 4.6 with no issues.
iknowstuff
Interesting, because I already felt like current models spit out too much garbage: verbose code that a human would write in a far more terse, beautiful, and grokkable way.
louiereederson
LLMs exist on a logarithmic performance/cost frontier. It's not really clear whether Opus 4.5+ represents a level shift of that frontier or just inhabits a place on the curve that delivers higher performance at rapidly diminishing returns to inference cost. To me, it is hard to reject the latter hypothesis today. The fact that Anthropic is rapidly trying to raise prices may betray that its recent lead comes at the cost of dramatically higher operating costs; its gross margins this past quarter will be an important data point. I think the tendency of model-assessment graphs to put the log of cost/tokens on the x-axis (e.g. Artificial Analysis' site) has obscured this dynamic.
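The log-axis point can be made concrete with a few lines of arithmetic. This is only an illustration of the frontier shape the comment describes; the coefficients are invented, not fit to any real benchmark data.

```python
import math

# Hypothetical frontier: benchmark score grows with log(cost). Under this
# shape, every fixed score gain costs a constant *multiple*, not a constant
# increment -- which a log-scaled x-axis renders as evenly spaced steps.
a, b = 40.0, 10.0  # made-up intercept and slope

def score(cost_per_task):
    """Hypothetical benchmark score as a function of inference cost ($)."""
    return a + b * math.log10(cost_per_task)

# Going from $1 to $10 per task buys the same +10 points as $10 to $100,
# even though the second step costs 10x more in absolute dollars.
gain_cheap = score(10) - score(1)
gain_expensive = score(100) - score(10)
print(gain_cheap, gain_expensive)
```

On a linear cost axis the same curve flattens out visibly, which is the diminishing-returns dynamic the comment argues gets hidden.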
xd1936
And what about with Caveman[1]? 1. https://github.com/juliusbrussee/caveman
atonse
Just yesterday I was happy to have gotten my weekly limit reset [1]. And although I've been doing a lot of mockup work (so a lot of HTML getting written), I think the 1M token stuff is absolutely eating up tokens like CRAZY. I'm already at 27% of my weekly limit in ONE DAY. https://news.ycombinator.com/item?id=47799256
jmward01
Yeah. I just did a day with 4.7 and I won't be going back for a while. It is just too expensive. On top of the tokenization change, the thinking seems to be eating a lot more too.
rafram
Pretty funny that this article was clearly written by Claude.
markrogersjr
4.7 one-shot rate is at least 20-30% higher for me
bcjdjsndon
Because those brainiacs added 20-30% more system prompt
CodingJeebus
The fundamental problem with these frontier model companies is that they're incentivized to create models that burn through more tokens, full stop. It's a tale as old as capitalism: every day you wake up and choose whether to deliver more value to your customers or to your shareholders; you cannot do both simultaneously forever. People love to throw around "this is the dumbest AI will ever be", but the corollary is "this is the most aligned the incentives between model providers and customers will ever be", because we're all just burning VC money for now.
stefan_
I don't know anything about tokens. Anthropic says Pro has "more usage*", Max has 5x or 20x "more usage*" than Pro. The link to "usage limits" says "determines how many messages you can send". Clearly no one is getting billed for tokens.
_pdp_
IMHO there is a point where incremental model quality hits diminishing returns. It is like comparing an 8K display to a 16K display: at normal viewing distance the difference is imperceptible, but 16K comes at a significant premium. The same applies to intelligence. Sure, some users might register a meaningful bump, but if 99% can't tell the difference in their day-to-day work, does it matter? A 20-30% cost increase needs to deliver a proportional leap in perceivable value.
mikert89
The compute is expensive, what is with this outrage? People just want free tools forever?
sipsi
I tried to run my usual test (similar to the pelican test, but a bit more complex), but it burned through the 5-hour limit in 5 minutes. Then, after 5 hours, I said "go on" and the results were the worst I've ever seen.
qq66
This is the backdoor way of raising prices: just inflate the token pricing. It's like ice cream companies shrinking the box instead of raising the price.
Yukonv
Some broad assumptions are being made that plans give you a precise equivalent of API cost. This is not the case: reverse engineering of plan usage shows cached input is free [0]. If you re-run the math with cached input removed, the usage cost is ~5-34% more. Was the token plan budget increase [1] proportional, to account for this? Can't say with certainty. For those paying API costs, though, the price hike is real. [0] https://she-llac.com/claude-limits [1] https://xcancel.com/bcherny/status/2044839936235553167
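The adjustment Yukonv describes can be sketched as follows. Every rate and token count below is a made-up placeholder, not Anthropic's actual pricing, and where a real session lands in the quoted ~5-34% range depends on its cache hit ratio.

```python
# Price the same hypothetical session two ways: with cached input billed
# (API-style) and with cached input treated as free (as the reverse-
# engineered plan accounting suggests).

def usage_cost(fresh_in, cached_in, out, in_p, cache_p, out_p):
    """Dollar cost given token counts and per-million-token prices."""
    return (fresh_in * in_p + cached_in * cache_p + out * out_p) / 1_000_000

# Hypothetical agentic session: a large cached context, modest fresh I/O.
fresh_in, cached_in, out = 200_000, 1_000_000, 50_000
in_p, cache_p, out_p = 3.0, 0.30, 15.0  # placeholder $/Mtok rates

billed = usage_cost(fresh_in, cached_in, out, in_p, cache_p, out_p)
cache_free = usage_cost(fresh_in, 0, out, in_p, cache_p, out_p)

# If plans don't count cached input, the API user pays this much more
# for the identical session than the plan's implied accounting suggests:
print(f"{billed / cache_free - 1:.0%}")
```

With these placeholder numbers the gap comes out around 22%; heavier cache usage pushes it higher, lighter usage lower.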
encoderer
In my “repo os” we have an adversarial agent harness running gpt5.4 for plan and implementation and opus4.6 for review. This was the clear winner in the bake-off when 5.4 came out a couple months ago. Re-ran the bake-off with 4.7 authoring and… gpt5.4 still clearly winning. Same skills, same prompts, same agents.md.
lacoolj
This is probably a side effect of this (from the Anthropic launch post): > In Claude Code, we've raised the default effort level to xhigh for all plans. Try changing your effort level and see what results you get.
curioussquirrel
Claude's tokenizers have actually been getting less efficient over the years (I think we're on at least the third iteration since Sonnet 3.5). And if you prompt the LLM in a language other than English, or if your users prompt it or generate content in other languages, the costs go even higher. And I mean hundreds of percent more for languages with complex scripts like Tamil or Japanese. If you're interested in the research we did comparing the tokenizers of several SOTA models across multiple languages, just hit me up.
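The multilingual point can be illustrated crudely without access to Claude's (non-public) tokenizer: BPE vocabularies trained mostly on English tend to fall back toward byte-level pieces for under-represented scripts, so UTF-8 bytes per character serves as a rough proxy for the inflation. This is only a proxy, not a real token count, and the sample strings are my own.

```python
# Rough proxy for per-language tokenizer cost: scripts whose characters
# take 3 UTF-8 bytes (Japanese kana/kanji, Tamil) tend to cost several
# times more tokens than ASCII English when the vocabulary lacks good
# coverage, since BPE fallback operates on bytes.

samples = {
    "English":  "Hello, how are you today?",
    "Japanese": "こんにちは、今日はお元気ですか？",
    "Tamil":    "வணக்கம், இன்று எப்படி இருக்கிறீர்கள்?",
}

for lang, text in samples.items():
    n_chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    print(f"{lang:9s} {n_chars:3d} chars  {n_bytes:3d} bytes  "
          f"{n_bytes / n_chars:.1f} bytes/char")
```

Measured against a real tokenizer (e.g. via a provider's token-counting endpoint) the multiplier is often larger still, since complex scripts also split within characters' combining marks.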