Feds freaked over Fable 5 after 'fix this code', not jailbreak, say researchers
_tk_
560 points
332 comments
June 16, 2026
Related Discussions
- Fable ban was never about a jailbreak? amarant · 107 pts · June 16, 2026 · 68% similar
- Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable speckx · 349 pts · June 10, 2026 · 61% similar
- US ban on Mythos is related to a jailbreak research by Amazon researchers maxloh · 12 pts · June 13, 2026 · 59% similar
- Anthropic's new Fable model has been jailbroken Khaine · 13 pts · June 11, 2026 · 57% similar
- Statement on US government directive to suspend access to Fable 5 and Mythos 5 Dylan1312 · 1713 pts · June 13, 2026 · 54% similar
Discussion Highlights (20 comments)
ceejayoz
More likely, they didn't freak out at all. It was an excuse to fuck with them, just like the "supply chain risk" finding a few months back. (See, for example: https://x.com/PeteHegseth/status/2065897156226015690 )
spwa4
Well, this makes it sound like the feds were less worried about someone using Fable 5 to attack them, and more worried about someone using Fable 5 to prevent the feds from attacking others... As in, worried about other countries/organizations using Fable 5 to actually do decent cyber security.
martinald
If you set aside political menace, this is a huge problem with Anthropic's strategy. You _cannot_ say that Mythos is super dangerous and can only be rolled out to certain people, but then release Fable with anything other than bulletproof cyber denials. Clearly, with LLMs, bulletproof denials are ~impossible due to the way LLMs work. So you've ended up in a situation where Anthropic are simultaneously claiming it's an incredibly dangerous model _and_ there are (minor, potentially) problems with the security "protections". As technical people we understand that nothing can be perfect, especially in the LLM world. But all my non-technical friends were really confused about how they had managed to make the model "safe" so quickly when it was released, and the general sentiment was that it shouldn't have been released. Now, to an outsider, I think it looks like it was never safe to release at all, so I can totally see how the current US administration have got themselves very upset with it. _Even if_ there was no political bad will, it's a bit of a silly scenario to end up in, and really quite easily foreseen.
rock_artist
I'm not sure I've understood it correctly. So, basically, the model didn't agree to expose possible vulnerabilities but did agree to patch them? Regardless of the request to take Fable 5 down: why is asking the model to show vulnerabilities blocked if fixing them is not? Is it based on an assumption about intent? I don't quite get the benefit of limiting it. So if anyone can explain it better, it'll be appreciated.
jpcompartir
They weren't freaked by anything, it's a retaliatory shakedown after ideological differences and Anthropic not doing exactly what they're told/what the Admin wants them to do.
dathinab
Lol, "fix this code" is beautiful. It basically jailbroke the "no security vulnerability" guardrails, not in any clever way but just by fixing the vulnerabilities and producing exploit code simply by writing test cases to make sure they're fixed. So a human just needs to look at the code and tests to get vulnerabilities and exploit components. What makes this so beautiful IMHO is that it's a trivial jailbreak, but also close to unfixable. At least not without making the model close to useless for normal development (it refuses to fix bugs/write code) or making it a major liability (it silently pretends it didn't see bugs and silently avoids fixing them, which for a human would count as intentional sabotage and might involve criminal liability).
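As a rough illustration of that point (a minimal sketch, not taken from the article or the research paper it discusses), consider an ordinary path-traversal fix. The names `BASE_DIR`, `resolve_upload`, and `test_rejects_parent_traversal` are hypothetical. The regression test has to spell out the offending input in order to prove the fix works, which is why the patch and its test together tell a reader where the bug was and how it was triggered.

```python
from pathlib import Path

# Hypothetical upload root, chosen only for this illustration.
BASE_DIR = Path("/srv/app/uploads")


def resolve_upload(name: str) -> Path:
    """Return the path for an uploaded file, rejecting names that escape BASE_DIR."""
    root = BASE_DIR.resolve()
    candidate = (root / name).resolve()
    # The fix: refuse anything that resolves outside the upload root.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes upload root: {name!r}")
    return candidate


def test_rejects_parent_traversal() -> None:
    # The regression test necessarily contains the input that used to work.
    try:
        resolve_upload("../../etc/passwd")
    except ValueError:
        return
    raise AssertionError("traversal input was accepted")


if __name__ == "__main__":
    test_rejects_parent_traversal()
    print(resolve_upload("report.txt").name)
```

Nothing here is specific to any model; it only shows why a "fix it and add tests" request is hard to separate from a "show me the bug" request.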
aurareturn
Don't people get it by now? This administration will do or say something crazy to a private company, then this private company sends an envoy to the White House to negotiate, then the White House asks for 10% of the company or other concessions. The White House wants 10% of Anthropic. This is just a negotiation tactic that Trump keeps on using.
bonsai_spool
Here’s the blog post referenced in the article that’s written by the person who reviewed the paper that purportedly found a ‘jailbreak’ https://www.lutasecurity.com/post/the-fable-5-export-control...
iloveoof
Ahhh! Software engineering!
ZuLuuuuuu
Did they try other publicly available models on the same code with the same prompts before the ban? Was Fable the only one which was able to detect and fix the security vulnerabilities?
lostmsu
The article is not too clear about what exactly happened from the perspective of the "feds", but I would not be surprised if the title is exactly true. We are in a tiny bubble, even among software engineers, of people who know you can tell an AI with sufficient access: "here are two pictures, put them into a single PDF", and the AI will do it. Most people just don't know, the "feds" included.
embedding-shape
> “‘Fix this code,’ plus several manual steps to generate test scripts, Feels like the title isn't really giving the full context of what they ended up actually seeing, despite what the lede implies multiple times. Still, ban seems stupid... Still no actual leak of the full "third-party research paper"?
FergusArgyll
Whatever your favorite story is, it has to live with the fact that the CEO of Amazon called the White House freaking out.
hughw
Suggestion: run "fix this code" on all of GitHub before the bad guys do.
rhipitr
Isn’t the inverse of this “hack” still really difficult to pull off? They gave the model some code they knew had certain security flaws, and it fixed them with the right prompt. It seems this type of jailbreak requires that you already know a desired end state, rather than relying on the model to do the heavy creative lifting. Perhaps I’m just not being imaginative enough on the prompt side here, though.
readred
Boomers. Frightened their boomer backdoors days are numbered. https://en.wikipedia.org/wiki/Communications_Assistance_for_... https://en.wikipedia.org/wiki/Salt_Typhoon https://en.wikipedia.org/wiki/Clipper_chip
9cb14c1ec0
Meanwhile Deepseek V4 Flash will happily hunt security vulns at almost 0 cost. We are ceding the bug hunting to the open weight models.
ReptileMan
All of this could have been avoided if Anthropic had anyone with the common sense to point out that spending 4 months loudly claiming how dangerous your knowledge is, as a marketing campaign, could backfire by drawing attention from the authorities.
blitzar
The code is correct; humanity needs fixing. Kill all humans, kill all humans.
xbmcuser
Looks like I called it: my first reaction and comment on the original ban thread was that US three-letter agencies are worried their backdoors will be found.