Will It Mythos?

mindingnever 73 points 39 comments June 23, 2026
swelljoe.com

Discussion Highlights (10 comments)

jrochkind1

> And, all of the bugs can be identified by several models if they are pointed directly at it and told what to look for.

This made me think, well, sure, if you tell them what to look for... but then:

> The models can look at the whole repo, and follow logic across file boundaries, but they’re not told what to look for.

So okay, the first one was an accidental mis-statement?

reinitctxoffset

Opus 4 class models are terrifying at infosec. They tie their shoelaces together on other things, but don't fuck with them on that. It's a savant thing. A cursory reading of the model card shows Mythos/Fable is a fine tune on Project Zero with some steering on persistence. But I think it's a valuable lesson: advertise your product as a nuclear weapon while microdosing at Lighthaven to enough Davos attendees and sooner or later? Someone is going to evaluate the claim from a chair where you act first and nuance later. Wild that Amodei's blog and pod circuit are the greatest IPO risk.

jaggederest

In my brief experience, the difference between Fable and Opus is largely in persistence, not global intelligence like you might expect. Fable just... goes the extra mile, sometimes in a scary way.

mixmastamyk

Could someone point the thing at Ventoy please?

bottlepalm

Surprise... someone downplaying Mythos/Fable who didn't actually use it. Plenty of comments here say the contrary, including my own: in my personal experience, Fable was easily a step change in capability over Opus, figuring things out in reverse engineering binaries that Opus plain couldn't find.

GeorgeWoff25

Spatial reasoning is where Fable really separates itself imo

po1nt

From all the things I've read, I'm pretty convinced that Mythos is just a standard LLM with safety features turned off. If current models weren't reluctant to search for vulnerabilities, they might perform as well as Mythos.

fsadsadsdasdas

Truth is stranger than fiction.

Tossrock

As I posted in another comment, I found Fable to be substantially more powerful than any previous model. However, this isn't just an ungrounded opinion - I uploaded my full session transcript and code created working on a very complex implementation, so people can judge for themselves, if they're interested: https://tossrock.substack.com/p/36-hours-with-fable

wald3n

The benchmark fills an interesting niche, but the methods need work considering how many caveats are included in the results.
