Caveman: Why use many token when few token do trick
tosh
740 points
325 comments
April 05, 2026
Related Discussions
Found 5 related stories in 40.8ms across 3,663 title embeddings via pgvector HNSW
- Caveman Mode Save Token? brightball · 15 pts · April 04, 2026 · 63% similar
- Universal Claude.md – cut Claude output tokens killme2008 · 231 pts · March 31, 2026 · 48% similar
- Show HN: Mcp2cli – One CLI for every API, 96-99% fewer tokens than native MCP knowsuchagency · 144 pts · March 09, 2026 · 45% similar
- AI Tokens Are Mana herbertl · 11 pts · March 30, 2026 · 44% similar
- Show HN: Nit – I rebuilt Git in Zig to save AI agents 71% on tokens fielding · 20 pts · March 26, 2026 · 43% similar
Discussion Highlights (20 comments)
andai
No articles, no pleasantries, and no hedging. He has combined the best of Slavic and Germanic culture into one :)
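The "no articles, no pleasantries" idea can be sketched as a toy text filter. This is purely illustrative: the word list and the whitespace "tokenizer" below are made up for the demo, and the actual skill works by prompting the model, not by filtering text.

```python
# Toy illustration of caveman mode: strip articles and filler words.
# The FILLER set is hypothetical; the real skill just instructs the model.
FILLER = {
    "a", "an", "the", "please", "kindly", "just", "really",
    "basically", "i", "think", "that",
}

def cavemanize(text: str) -> str:
    """Drop articles/pleasantries and return the shortened text."""
    kept = [w for w in text.split() if w.lower().strip(".,!?") not in FILLER]
    return " ".join(kept)

def savings(text: str) -> float:
    """Fraction of whitespace 'tokens' removed (a crude proxy for LLM tokens)."""
    before = len(text.split())
    after = len(cavemanize(text).split())
    return 1 - after / before

verbose = "I think that the fix is basically just a missing null check."
print(cavemanize(verbose))            # "fix is missing null check."
print(f"{savings(verbose):.0%} fewer words")
```

Real savings would depend on the model's tokenizer and on how much of the output is filler in the first place.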
ArekDymalski
While it's really useful now, I'm afraid that in the long run it might accelerate the language atrophy that's already happening. I still remember when people used to enter full questions into Google and write texts with capital letters, commas, and periods.
TeMPOraL
Oh boy. Someone didn't get the memo that for LLMs, tokens are units of thinking. Whatever feat of computation needs to happen to produce the results you seek, it needs to fit in the tokens the LLM produces. Being a finite system, there's only so much computation the LLM's internal structure can do per token, so the more you force the model to be concise, the more difficult the task becomes for it; in the worst case, you can guarantee you won't get a good answer, because the task requires more computation than is possible within the tokens produced. In other words, by demanding the model be concise, you're literally making it dumber. (Separating "chain of thought" out into "thinking mode" and removing user control over it definitely helped with this problem.)
andai
So it's a prompt to turn Jarvis into Hulk!
zahirbmirza
You can also make huge spelling mistakes and use incomplete words with LLMs; they just sem to know better than any spl chk wht you mean. I use such speak to cut the time I spend typing to them.
VadimPR
Wouldn't this affect the quality of the output negatively? Thanks to chain of thought, having the LLM be explicit in its output actually lets it produce higher-quality answers.
teekert
Idk I try talk like cavemen to claude. Claude seems answer less good. We have more misunderstandings. Feel like sometimes need more words in total to explain previous instructions. Also less context is more damage if typo. Who agrees? Could be just feeling I have. I often ad fluff. Feels like better result from LLM. Me think LLM also get less thinking and less info from own previous replies if talk like caveman.
bhwoo48
I was actually worried about high token costs while building my own project (an infra bundle generator), and this gave me a good laugh plus some solid ideas. A 75% reduction is insane. Starred.
ryanschaefer
Kinda ironic that this description is so verbose:

> Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or invokes /caveman

For the first part: couldn't this just be a UserSubmitPrompt hook with a regex against these phrases? See additionalContext in the JSON output of a script: https://code.claude.com/docs/en/hooks#structured-json-output

For the second: /caveman will always invoke the skill /caveman: https://code.claude.com/docs/en/skills
Hard_Space
Also see https://arxiv.org/pdf/2604.00025 ('Brevity Constraints Reverse Performance Hierarchies in Language Models' March 2026)
saidnooneever
LOL, it actually reads like how humans reply; the name is too clever :'). Not sure how effective it will be at driving down costs, but honestly it will make my day not to have to read through an entire essay about some trivial solution. tldr; Claude skill, short output, ++good.
gozzoo
I think this could be very useful not when we talk to the agent, but when the agent talks back to us. Usually, they generate so much text that it becomes impossible to follow. If we received short, focused messages, the interaction would be much more efficient. This should be true for all conversational agents, not only coding agents.
virtualritz
This is the best thing since I asked Claude to address me in third person as "Your Eminence". But combining this with caveman? Gold!
bogtog
I'd be curious whether there are measurements of the final effects, since presumably models won't <think> in caveman speak, nor code like that.
stared
I would prefer to talk like Abathur ( https://www.youtube.com/watch?v=pw_GN3v-0Ls ). Same efficiency but smarter.
cadamsdotcom
Caveman need invent chalk and chart make argument backed by more than good feel.
rschiavone
This trick reminds me of "OpenAI charges by the minute, so speed up your audio" https://news.ycombinator.com/item?id=44376989
nayroclade
Cute idea, but you're never gonna blow your token budget on output. Input tokens are the bottleneck, because the agent's ingesting swathes of skills, directory trees, code files, tool outputs, etc. The output is generally a few hundred lines of code and a bit of natural language explanation.
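The point above is easy to check with back-of-envelope arithmetic. All prices and token counts below are made-up round numbers for illustration, not real API rates:

```python
# Back-of-envelope: when input tokens dominate, even a 75% cut in output
# tokens barely moves the total cost. All figures here are hypothetical.
def turn_cost(input_tokens: int, output_tokens: int,
              in_price: float, out_price: float) -> float:
    """Cost of one agent turn in dollars, given per-million-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

IN_PRICE, OUT_PRICE = 3.0, 15.0    # $ per million tokens (illustrative)
INPUT, OUTPUT = 80_000, 2_000      # one agent turn (illustrative)

normal = turn_cost(INPUT, OUTPUT, IN_PRICE, OUT_PRICE)
caveman = turn_cost(INPUT, OUTPUT // 4, IN_PRICE, OUT_PRICE)  # 75% fewer output tokens
print(f"normal: ${normal:.4f}  caveman: ${caveman:.4f}  saved: {1 - caveman/normal:.0%}")
```

With these numbers, a 75% output reduction trims total cost by only around 8%, which is the shape of the argument: the lever is small when input dwarfs output.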
doe88
> If caveman save you mass token, mass money — leave mass star. Mass fun. Starred.
vivid242
Great idea! If the person who made it is reading: is this based on the board game "Poetry for Cavemen"? (Explain things using only single-syllable words; it even comes with an inflatable log of wood for hitting each other!)