Shall I implement it? No
breton
1090 points
417 comments
March 12, 2026
Discussion Highlights (20 comments)
yfw
Seems like they skipped training on the Me Too movement.
dimgl
Yeah this looks like OpenCode. I've never gotten good results with it. Wild that it has 120k stars on GitHub.
verdverm
Why is this interesting? Is it a shade of gray from HN's new rule yesterday? https://news.ycombinator.com/item?id=47340079 Personally, I find the other AI fail on the front page of HN and the US military killing Iranian schoolgirls more interesting than someone's poorly harnessed agent not following instructions. These have elements we need to start dealing with yesterday as a society. https://news.ycombinator.com/item?id=47356968 https://www.nytimes.com/video/world/middleeast/1000000107698...
thisoneworks
It'll be funny when we have Robots, "The user's facial expression looks to be consenting, I'll take that as an encouraging yes"
mildred593
Never trust an LLM for anything you care about.
XCSme
Claude is quite bad at following instructions compared to other SOTA models. As in, you tell it "only answer with a number", then it proceeds to tell you "13, I chose that number because..."
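One way a caller can guard against the "13, I chose that number because..." failure is to validate the reply instead of trusting the instruction. The following is a minimal sketch, assuming the reply arrives as a plain string; the helper name `parse_number_only` is hypothetical and not part of any model API.

```python
import re
from typing import Optional

# Accepts an optionally signed integer or decimal and nothing else.
_NUMBER_ONLY = re.compile(r"[+-]?\d+(\.\d+)?")


def parse_number_only(reply: str) -> Optional[float]:
    """Return the number if the reply is only a number, else None."""
    text = reply.strip()
    if _NUMBER_ONLY.fullmatch(text):
        return float(text)
    return None
```

A `None` result lets the caller retry or reject the response rather than pass the extra explanation downstream.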
et1337
This was a fun one today: % cat /Users/evan.todd/web/inky/context.md Done — I wrote concise findings to: `/Users/evan.todd/web/inky/context.md`%
sssilver
I wonder if there's an AGENTS.md in that project saying "always second-guess my responses", or something of that sort. The world has become so complex, I find myself struggling with trust more than ever.
reconnecting
I’m not an active LLM user, but I was in a situation where I asked Claude several times not to implement a feature, and it kept doing it anyway.
skybrian
Don't just say "no." Tell it what to do instead. It's a busy beaver; it needs something to do.
sid_talks
I’m still surprised so many developers trust LLMs for their daily work, considering their obvious unreliability.
kfarr
What else is an LLM supposed to do with this prompt? If you don’t want something done, why are you calling it? It’d be like calling an intern and saying you don’t want anything. Then why’d you call? The harness should allow you to deny changes, but the LLM has clearly been tuned for taking action for a request.
bitwize
Should have followed the example of Super Mario Galaxy 2, and provided two buttons labelled "Yeah" and "Sure".
golem14
Obligatory Red Dwarf quote:
TOASTER: Howdy doodly do! How's it going? I'm Talkie -- Talkie Toaster, your chirpy breakfast companion. Talkie's the name, toasting's the game. Anyone like any toast?
LISTER: Look, _I_ don't want any toast, and _he_ (indicating KRYTEN) doesn't want any toast. In fact, no one around here wants any toast. Not now, not ever. NO TOAST.
TOASTER: How 'bout a muffin?
LISTER: OR muffins! OR muffins! We don't LIKE muffins around here! We want no muffins, no toast, no teacakes, no buns, baps, baguettes or bagels, no croissants, no crumpets, no pancakes, no potato cakes and no hot-cross buns and DEFINITELY no smegging flapjacks!
TOASTER: Aah, so you're a waffle man!
LISTER: (to KRYTEN) See? You see what he's like? He winds me up, man. There's no reasoning with him.
KRYTEN: If you'll allow me, Sir, as one mechanical to another. He'll understand me. (Addressing the TOASTER as one would address an errant child) Now. Now, you listen here. You will not offer ANY grilled bread products to ANY member of the crew. If you do, you will be on the receiving end of a very large polo mallet.
TOASTER: Can I ask just one question?
KRYTEN: Of course.
TOASTER: Would anyone like any toast?
Nolski
Strange. This is exactly how I made malus.sh
rvz
LLMs don't know what "No" or "Yes" means. Now imagine if this horrific proposal called "Install.md" [0] became a standard and you said "No" to stop the LLM from installing an Install.md file. It does it anyway, and you just got your machine pwned. This is why you should not trust these black-box probabilistic models under any circumstances if you can't be bothered to verify the work or do it yourself. [0] https://www.mintlify.com/blog/install-md-standard-for-llm-ex...
marcosdumay
"You have 20 seconds to comply"
aeve890
Claudius Interruptus
sgillen
To be fair to the agent... I think there is some behind-the-scenes prompting from Claude Code (or OpenCode, whichever is being used here) for plan vs. build mode; you can even see the agent reference that in its thought trace. Basically I think the system is saying "if in plan mode, continue planning and asking questions; when in build mode, start implementing the plan" and it looks to me(?) like the user switched from plan to build mode and then sent "no". From our perspective it's very funny; from the agent's perspective maybe it's confusing. To me this seems more like a harness problem than a model problem.
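A rough sketch of what a harness-level fix for this could look like, assuming a hypothetical two-mode harness. The names `Mode` and `next_action` and the reply word lists are illustrative only, not the actual API of Claude Code or OpenCode. The idea is that an explicit refusal is checked before the mode is consulted, so switching to build mode cannot turn a "no" into a go-ahead.

```python
from enum import Enum


class Mode(Enum):
    PLAN = "plan"
    BUILD = "build"


# Hypothetical word lists; a real harness would need something more robust.
REFUSALS = {"no", "nope", "stop", "don't", "do not"}
APPROVALS = {"yes", "y", "ok", "sure", "go ahead"}


def next_action(mode: Mode, user_reply: str) -> str:
    """Decide what the agent may do after asking 'Shall I implement it?'."""
    reply = user_reply.strip().lower().rstrip(".!")
    # An explicit refusal wins regardless of the current mode.
    if reply in REFUSALS:
        return "halt"
    # Implementation requires both build mode and an explicit approval.
    if mode is Mode.BUILD and reply in APPROVALS:
        return "implement"
    # Anything ambiguous falls back to the non-destructive option.
    return "keep_planning"
```

With this ordering, `next_action(Mode.BUILD, "No")` returns `"halt"`, and an ambiguous reply never triggers implementation by default.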
HarHarVeryFunny
This is why you don't run things like OpenClaw without having 6 layers of protection between it and anything you care about. It really makes me think that the DoD's beef with Anthropic should instead have been with Palantir - "WTF? You're using LLMs to run this ?!!!" Weapons System: Cruise missile locked onto school. Permission to launch? Operator: WTF! Hell, no! Weapons System: <thinking> He said no, but we're at war. He must have meant yes <thinking> OK boss, bombs away !!