Anonymous request-token comparisons from Opus 4.6 and Opus 4.7

anabranch 479 points 481 comments April 18, 2026
tokens.billchambers.me · View on Hacker News

Discussion Highlights (20 comments)

anabranch

I wanted to better understand the potential impact of the tokenizer change from 4.6 to 4.7. I'm surprised that it's 45%. It might go down (?) with longer-context answers, but it's still surprising. It can be more than 2x for small prompts.

someuser54541

Should the title here be 4.6 to 4.7 instead of the other way around?

therobots927

Wow this is pretty spectacular. And with the losses anthro and OAI are running, don’t expect this trend to change. You will get incremental output improvements for a dramatically more expensive subscription plan.

justindotdev

I think it's quite clear that staying with Opus 4.6 is the way to go. On top of the inflation, 4.7 is quite... dumb. I think they lobotomized this model while they were prioritizing cybersecurity and blocking people from performing potentially harmful security-related tasks.

coldtea

This, the push toward per-token API charging, and the rest are just a sign of things to come once they finally establish a moat and a full monopoly/duopoly, which is also what all the specialized tools like Designer and the integrations are about. It's going to be a very expensive game, and the masses will be left with subpar local versions. It would be like reversing the democratization of compilers and coding tooling that happened in the 90s and 00s, so that the polished, more capable tools are once again all proprietary.

ai_slop_hater

Does anyone know what changed in the tokenizer? Does it output multiple tokens for things that were previously one token?

ausbah

Is it really unthinkable that another OSS/local model will be released by DeepSeek, Alibaba, or even Meta that once again gives these companies a run for their money?

kalkin

AFAICT this uses a token-counting API to count the tokens in the same prompt under each model, so it's measuring the tokenizer change in isolation. But smarter models also sometimes produce shorter outputs and therefore fewer output tokens. That doesn't mean Opus 4.7 necessarily nets out cheaper (it might still be more expensive), but this comparison on its own isn't very useful.
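A minimal sketch of what such an isolated comparison might look like. The two tokenizers below are toy stand-ins, not the real ones (the linked post presumably uses a provider's token-counting API); the point is only that counting the same prompt under both versions isolates the tokenizer change from any difference in output length.

```python
# Toy illustration: count the same prompt under two hypothetical
# tokenizer versions and report the inflation. Neither function
# is a real tokenizer; both are stand-ins for this sketch.

def count_tokens_v1(text: str) -> int:
    # Stand-in for the old tokenizer: roughly one token per word.
    return len(text.split())

def count_tokens_v2(text: str) -> int:
    # Stand-in for the new tokenizer: same words, plus extra
    # tokens for long words that get split into sub-words.
    return sum(1 + len(word) // 6 for word in text.split())

def inflation(prompt: str) -> float:
    """Fractional increase in prompt token count from v1 to v2."""
    old, new = count_tokens_v1(prompt), count_tokens_v2(prompt)
    return (new - old) / old

prompt = "Explain the difference between tokenization strategies"
print(f"prompt-token inflation: {inflation(prompt):.0%}")
```

Note that this only measures prompt-side tokens; as the comment says, a fair cost comparison would also have to account for how many output tokens each model actually generates.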

fny

I'm going to suggest what's going on here is Hanlon's Razor for models: "Never attribute to malice that which is adequately explained by a model's stupidity." In my opinion, we've reached some ceiling where more tokens lead only to incremental improvements. A conspiracy seems unlikely given that all providers are still competing for customers, and a 50% token increase drives their infra costs up dramatically too.

tailscaler2026

Subsidies don't last forever.

Shailendra_S

45% is brutal if you're building on top of these models as a bootstrapped founder. The unit economics just don't work anymore at that price point for most indie products. What I've been doing is running a dual-model setup — use the cheaper/faster model for the heavy lifting where quality variance doesn't matter much, and only route to the expensive one when the output is customer-facing and quality is non-negotiable. Cuts costs significantly without the user noticing any difference. The real risk is that pricing like this pushes smaller builders toward open models or Chinese labs like Qwen, which I suspect isn't what Anthropic wants long term.
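The dual-model setup described above can be sketched as a simple router. Everything here is illustrative: the model names and the per-1K-token prices are made up and don't reflect any real pricing or API.

```python
# Dual-model routing sketch: send customer-facing, quality-critical
# work to the expensive model and everything else to the cheap one.
# Model names and prices are hypothetical, for illustration only.

PRICE_PER_1K_TOKENS = {
    "cheap-fast": 0.25,  # hypothetical price
    "premium": 5.00,     # hypothetical price
}

def pick_model(customer_facing: bool) -> str:
    """Route to the premium model only when quality is non-negotiable."""
    return "premium" if customer_facing else "cheap-fast"

def estimated_cost(tokens: int, customer_facing: bool) -> float:
    model = pick_model(customer_facing)
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]

# Internal batch job vs. a user-visible summary, same token budget:
print(pick_model(False), estimated_cost(10_000, False))
print(pick_model(True), estimated_cost(10_000, True))
```

With these made-up numbers the cheap route is 20x cheaper per token, which is why routing only the customer-facing fraction of traffic to the premium model cuts the bill so sharply.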

dakiol

We dropped Claude. It's pretty clear this is a race to the bottom, and we don't want a hard dependency on another multi-billion-dollar company just to write software. We'll be keeping an eye on open models (of which we already make good use). I think that's the way forward. Actually, it would be great if everybody put more focus on open models; perhaps we could come up with something like the "linux/postgres/git/http/etc." of LLMs: something we all benefit from without it being monopolized by a single billionaire-backed company. Wouldn't it be nice if we didn't need to pay for tokens? Paying for infra (servers, electricity) is already expensive enough.

ben8bit

Makes me think the model might not actually be smarter at all, just more token-dependent.

l5870uoo9y

My impression is that the reverse is true when upgrading from GPT-5 to GPT-5.4; it uses fewer tokens(?).

micromacrofoot

The latest Qwen actually performs a little better for some tasks; in my experience, the latest Claude still fails the car wash test.

tiffanyh

I was using Opus 4.7 just yesterday to help implement best practices on a single-page website. After just ~4 prompts I blew past my daily limit. Another ~7 prompts and I blew past my weekly limit. The entire HTML/CSS/JS was less than 300 lines of code. I was shocked by how fast it exhausted my usage limits.

mvkel

The cope is real with this model. Needing an instruction manual to learn how to prompt it "properly" is a glaring regression. The whole magic of (pre-nerfed) 4.6 was how it magically seemed to understand what I wanted, regardless of how well I articulated it. And now Anthropic is spinning the need to explicitly spell out instructions as a "feature"?!

blahblaher

Conspiracy time: they released a new version just so they could increase the price without people complaining so much ("see, this is a new model version, so we NEED to increase the price"), similar to how SaaS companies tack on some shit to the product so that they can raise prices.

templar_snow

Brutal. I've been noticing that 4.7 eats my Max subscription like crazy even when I do my best to juggle tasks between Sonnet 4.6 Medium and Haiku (or tell 4.7 to use them as subagents). Would love to know if anybody's found good token-saving approaches.

dackdel

They release 4.8 and delete everything else, and now 4.8 costs 500% more than 4.7. I wonder what it would take for people to start using Kimi or Qwen or the like.
