AI companies charge you 60% more based on your language, BPE tokens
vfalbor
23 points
19 comments
April 01, 2026
Discussion Highlights (10 comments)
vfalbor
"The Biggest Con of the 21st Century: Tokens. How AI Companies Are Charging You More Without You Even Realizing It." You pay for what you use. That's the deal. Except it's not. When you use an AI model (GPT-4, Claude, Gemini), you do not pay per word. You pay per token. And that tiny technical detail is quietly costing you, depending on which company you choose, up to 60% more for the exact same request.
Mindless2112
Funny they didn't include any CJK languages on their list.
simianwords
This has to be one of the worst things I have read. If this is not satire idk what counts
lxgr
“Pay by token” is priced by token, not word or semantic unit; news at 11? The product itself seems genuinely useful, but the article reads very sensationalist about something that should be pretty obvious. In other news: French publishers are paying 30% more for paper than English publishers!!
charcircuit
The companies didn't arbitrarily choose to bill by tokens. The cost to serve the models scales linearly with tokens, which makes it a reasonable pricing strategy. The reality is that you are charged more because it was more expensive to handle the request.
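Editor's sketch of the arithmetic behind the comment above, assuming purely linear per-token billing. The price and both token counts are hypothetical, chosen only to reproduce the "60% more" figure from the headline; they do not come from any vendor's price list or any real tokenizer.

```python
def request_cost(num_tokens: int, price_per_million: float) -> float:
    """Cost of one request under simple linear per-token billing."""
    return num_tokens * price_per_million / 1_000_000


def relative_overhead(tokens_a: int, tokens_b: int) -> float:
    """Fractional extra cost of request B over request A.

    Under linear billing the price cancels out, so only the
    token counts matter.
    """
    return tokens_b / tokens_a - 1.0


# Hypothetical values for illustration only.
PRICE_PER_MILLION = 3.00   # assumed price per million tokens
english_tokens = 1_000     # assumed token count for an English prompt
other_tokens = 1_600       # assumed token count for the same prompt in another language

english_cost = request_cost(english_tokens, PRICE_PER_MILLION)
other_cost = request_cost(other_tokens, PRICE_PER_MILLION)
overhead = relative_overhead(english_tokens, other_tokens)

print(f"English: {english_cost:.4f}  other: {other_cost:.4f}  overhead: {overhead:.0%}")
```

The point of the sketch is only that the surcharge equals the token-count ratio: whatever makes a text tokenize into more pieces raises the bill by the same proportion.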
Animats
It's an ad. "The Solution: TokensTree". From tokenstree.com I was expecting a secondary market in tokens, perhaps crypto-powered, but no. The cost difference for languages roughly correlates with how much text it takes to say something in that language. English is relatively terse. (This is a common annoyance when internationalizing dialog boxes. If sized for English, boxes need to be expanded.) They don't list any of the ideographic languages, which would be interesting.
simianwords
Europeans be like: AI commits a racism. AI commits an environmentalism. Now use my product (that won't solve either)
simonw
The title of this piece differs from the HN title, but the HN title is a lot better. The original title is "The Biggest Con of the 21st Century: Tokens", subhead "How AI Companies Are Charging You More Without You Even Realizing It" - which is an absurd title because tokens are NOT the "biggest con" of anything, and AI companies make it very clear exactly how their pricing works. I also don't like how this article presents numbers for language differences - in the "The Language Tax" section - but fails to clarify which tokenizer and where those numbers came from.
aprentic
There's certainly an interesting question here, even if Tokenstree doesn't provide a solution or even define the problem well. The broader questions are still interesting. If an AI is trained more on language A than language B but has some training in translating B to A, what is the overhead of that translation? If the abilities are combined in the same model, how much lower is the overhead than doing it as separate operations? I.e., is f(a) < f(b) < f(t(B,A))? Here a and b are prompts in A and B, and f() and t() are the cost of processing a prompt and the cost of translating a prompt, respectively. Then there's the additional question of what happens with character-based languages. It's not obvious how it would make sense to assign multiple tokens to a single character, but there's the question of how much information is in character-based vs. phonetic words and what the information content of sentences in either one is.
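Editor's sketch on the "multiple tokens to a single character" point above. Byte-level BPE tokenizers (the GPT-2 style) start from UTF-8 bytes rather than characters, so a character that takes several bytes can end up as several tokens if the learned merges never join those bytes. The code below only counts UTF-8 bytes per character; it does not run any real tokenizer, and whether a given character is actually split depends on the vocabulary, which is not shown here. The sample characters are arbitrary.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Arbitrary sample characters, written as escapes to avoid encoding ambiguity.
samples = {
    "latin": "a",               # U+0061
    "accented": "\u00e9",       # U+00E9, e with acute accent
    "cjk": "\u8a9e",            # U+8A9E, a CJK ideograph
    "emoji": "\U0001F642",      # U+1F642, an emoji outside the BMP
}

# Upper bound on byte-level tokens per character: one token per byte,
# reached only if no merge in the vocabulary covers the character.
byte_counts = {name: utf8_len(ch) for name, ch in samples.items()}

for name, count in byte_counts.items():
    print(f"{name}: {count} byte(s)")
```

A byte count is an upper bound, not a prediction: a tokenizer trained on plenty of text in a given script will usually have merged common characters into single tokens.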
cyberge99
English Teachers: “Proper grammar is cost effective!”