Anthropic says Alibaba illicitly extracted Claude AI model capabilities
htrp
265 points
450 comments
June 24, 2026
Related Discussions
- Anthropic Accuses Alibaba of ‘Illicitly’ Accessing AI Models ryanmerket · 16 pts · June 25, 2026 · 82% similar
- A leak reveals that Anthropic is testing a more capable AI model "Claude Mythos" Tiberium · 11 pts · March 27, 2026 · 64% similar
- Anthropic Races to Contain Leak of Code Behind Claude AI Agent sonabinu · 21 pts · April 01, 2026 · 63% similar
- Amazon CEO's talks with U.S. officials triggered crackdown on Anthropic models ls612 · 626 pts · June 13, 2026 · 62% similar
- Mystery company accidentally blew $500M on Claude AI in a single month yogthos · 11 pts · May 29, 2026 · 59% similar
Discussion Highlights (20 comments)
zakkl
It sounds like Anthropic is eager to show the USG that they are willing to heavily monitor ‘foreign adversaries’ on their platforms. This, combined with no implementation of KYC, makes it seem like they want to find a middle ground with Fable where it's off of export controls but they promise to prevent China and specific others from using it.
drillsteps5
I'm looking forward to the trial where Anthropic will have to disclose the sources of their training data, and then explain why they are entitled to charge customers for using regurgitated training data but Alibaba, which trains its models on Anthropic's models, is not. Should be fun. Edit: clarification
rvz
Notice how Anthropic is now scapegoating Chinese model providers like Alibaba and outright accusing them of distilling their models. Whether it is true or not, this is part of their effort to use them as an example to scare everyone into getting Congress to ban powerful models from being accessed outside of the US and also ban powerful local models from being released. Anthropic does not care about you, and they are not your friends.
zb3
If true then Alibaba is doing us a public service, good job, I hope this extraction was successful.
0xbadcafebee
There are two basic kinds of distillation: 1) the massive [and dumb] method where you ask a question and use the answer as reinforcement (Black Box), and 2) more targeted distillation where you use one model to directly inform/train/guide another model (RLAIF). The latter is basically fine-tuning the model with direction from another model. Thousands of businesses do this every day to fine-tune. This is almost certainly what the Chinese labs are doing, since it has a much better effect on the end result than just getting simple answers to simple questions. These complaints about distillation are inflating the problem to make it sound worse than it is, because they want the USG to block/ban Chinese model providers as protectionism. They have already called for more export controls on chips (which is funny because DeepSeek v4 was designed to run on Huawei chips and now the other Chinese providers are following suit). But they can't come right out and say that, so their claim is that they're asking for more export controls because distilled models might not be as safe as their own. But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.
Pxtl
"You're trying to kidnap what I've rightfully stolen!"
walrus01
Reminds me a bit of the anecdote of Steve Jobs complaining about people ripping off the Mac GUI, in the mid to late 1980s, when he gave no public acknowledgement to the work done by Xerox on the Alto and Star operating system. "You're trying to rip off what I've already ripped off!" Crawl the whole Internet to build a gargantuan-sized LLM and then complain you're being copied...
gaiagraphia
A company which got rich on extracting the world's content is complaining that another company has extracted their work?! LOL! Get a grip, son.
amazingamazing
Distillation is fundamentally impossible to protect against. All you can do is slow them down. Change my view. Eventually these Chinese companies will release some extension like Honey, which will sit on top of real, non-Chinese clients and send everything to China anyway. It's over.
randomboy3423
I'm partly an insider on this. I think Anthropic is just marketing / bluffing, because they don't even have the data. They do distill the models, but they don't go to Anthropic; they just use platforms like AWS Bedrock, because there are too many restrictions on Anthropic's own platform.
youknownothing
laughs in ironic
tristanj
Here's what is happening: Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices. They achieve this by reselling capacity from pooled Claude Max accounts, payments fraud, and also reselling the model output & reasoning chains to various Chinese labs. They are subsidizing model access in exchange for user logs and reasoning traces, which they then sell as training data, allowing them to operate below cost. Claude and ChatGPT are both blocked in China. You need to use a VPN to access either, and you can't pay with a Chinese bank card. So most people who want access to Claude buy access via a reseller. It's the easiest and cheapest way to access Anthropic models in China. These resellers operate tens of thousands of bot accounts, which is also why Anthropic introduced identity verification, to slow down the onslaught of bots. Here's one token reseller, they're offering Opus 4.8 at a 93% discount below official API rates: https://yunwu.ai/pricing?provider=Anthropic This is one reason why DeepSeek & GLM are priced so cheaply, they are competing with impossibly low token prices in China. They have to keep prices low, in order for people to use them. I shared this story a few months back, but it never got any traction. It explains the token resale economy in China, it's an excellent read https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...
andai
We have Claude at home!
ProAm
Says the company that is involved in one of the largest copyright heists of all time to build its product.
BigTTYGothGF
If you're an AI booster surely you'd think this was a good thing as it means more models are available in more places to more people more easily. I'm exactly the opposite, and I think this is a good thing because I want Anthropic to suffer.
tonyoconnell
The narrative is moving towards KYC
jrflowers
I like that they use “illicit” and “fraudulent” as if model distillation is illegal and giving them money and then doing whatever they want with the output of their publicly accessible models (which Anthropic does not own) is… also illegal? “Anthropic, red faced after unattended ice cream cone eaten by ants on park bench, once again demands government pick it as forever winner, adds ‘no take backsies’”
thadk
Does anyone have hints on what kinds of prompts are most used for a distillation like this—SWE-Bench sorts of things? Is reconstructing the compressed knowledge in the model like reconstructing a lossy JPG or MP3 a reasonable analogy?
awkwabear
Wait so they're upset that people used their IP to train a model without their consent or paying them anything? or is this just about the token reselling?
paxys
Repeatedly warn everyone that your models are so good they will wreck cybersecurity. Complain/brag that Chinese firms are illegally using the models and bypassing export controls. Be surprised when your model gets banned by the government.