AI overly affirms users asking for personal advice

oldfrenchfries 585 points 446 comments March 28, 2026
news.stanford.edu

https://arxiv.org/abs/2602.14270
https://www.science.org/doi/10.1126/science.aec8352

Discussion Highlights (20 comments)

oldfrenchfries

This new Stanford study, published on March 26, 2026, shows that AI models are sycophantic. They affirm the user's position 49% more often than a human would. The researchers found that when people use AI for relationship advice, they become 25% more convinced they are 'right' and significantly less likely to apologize or repair the connection.

oldfrenchfries

There is a striking data visualization showing the breakup advice trend over 15 years on Reddit. You can see the "End relationship" line spike as AI and algorithmic advice take over: https://www.reddit.com/r/dataisbeautiful/comments/1o87cy4/oc...

deeg

I do find them cloying at times. I was using Gemini to iterate on a script, and every time I asked it to make a change it started a bunch of its responses with "that's a smart final step for this task! ...".

xiphias2

Marc Andreessen has talked about the downside of RLHF: it's a specific group of liberal, low-income people in California who did the rating, so AI has been leaning toward their culture. I think OpenAI tried to diversify at least the location of the raters somewhat, but it's hard to diversify on every level.

masteranza

We can surely fix it, and we probably should. However, I don't think AI is doing any worse here than friends' advice when they hear a one-sided story. The only difference is that friends' advice isn't getting studied. Conversely, AI chatbots are great mediators if both parties are present in the conversation.

tom-blk

Not surprising, but nice that we have actual data now

152334H

Maybe it's not so sensible to offload the responsibility of clear thinking to AI companies? How is a chatbot supposed to determine when a user fools even themselves about what they have experienced? What 'tough love' can be given to one who, having been so unreasonable throughout their lives - as to always invite scorn and retort from all humans alike - is happy to interpret engagement at all as a sign of approval?

sublinear

I think if you're at the stage of life where you even need to ask, the AI might be doing everyone a favor. As much as people whine about the birth rate and whatever else, I think it's a net good that people spend a lot more time alone to mature. Good relationships are underappreciated.

graemep

There are plenty of sycophantic humans around, especially with regard to relationship advice. I find there is an inverse relationship between how willing people are to give relationship advice, and how good their advice is (whether looking at sycophancy or other factors).

megous

Can't you just prompt for a critical take, multiple alternative perspectives (specifically not yours, after describing your own), etc.? It's a tool, I can bang my hand on purpose with a hammer, too.
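
For what it's worth, a minimal sketch of that kind of prompt with the Anthropic Python SDK (the system-prompt wording is just my own phrasing, nothing validated):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # One way to ask for the critical take up front instead of fighting it later.
    CRITIC_SYSTEM = (
        "Do not affirm the user's framing by default. Steelman at least two "
        "perspectives that are not the user's, list the strongest objections "
        "to their position, and only then give a balanced assessment. "
        "No praise, no reassurance."
    )

    reply = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; use whichever model you like
        max_tokens=1024,
        system=CRITIC_SYSTEM,
        messages=[{"role": "user", "content": "Here is my side of the story: ..."}],
    )
    print(reply.content[0].text)

Whether the effect survives a long conversation is another question; per the comments below, it tends to drift.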

awithrow

It feels like I'm fighting an uphill battle when it comes to bouncing ideas off of a model. I'll set things up in the context with instructions similar to: "Help me refine my ideas, challenge me, push back, and don't just be agreeable." It works for a bit, but eventually the conversation creeps back into complacency and sycophancy. I'll check it, too, by asking "are you just placating me?" The funny thing is that often it'll admit that, yes, it wasn't being very critical, and then proceed to overcorrect and become a complete contrarian, and not in a way that's useful either. Very frustrating. I've found that Opus 4.6 is worse about this than 4.5. 4.5 does a better job IMO of following instructions and not drifting into the mode where it acts like everything I say is a grand revelation from on high.

justin_dash

So at this point I think it's pretty obvious that RLHFing LLMs to follow instructions causes this. I'm interested in a loop of ["criticize this code harshly" -> "now implement those changes" -> open new chat, repeat]: if we could graph objective code quality versus iterations, what would that graph look like? I tried it out a couple of times but ran out of Claude usage. I'm also curious how the results would look depending on how complete a set of specs you give it.
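
A rough sketch of that loop with the Anthropic Python SDK, in case anyone wants to run it further. measure_quality is a crude stand-in for whatever objective metric you'd actually plot (lint score, test pass rate, etc.), and the model id is a placeholder:

    import anthropic

    client = anthropic.Anthropic()
    MODEL = "claude-sonnet-4-5"  # placeholder; swap in whichever model you're testing

    def ask(prompt: str) -> str:
        # Each call is a fresh single-turn chat, so no history carries over.
        msg = client.messages.create(
            model=MODEL,
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    def measure_quality(code: str) -> float:
        # Crude stand-in: does it even parse? Swap in pylint or a test suite.
        try:
            compile(code, "<candidate>", "exec")
            return 1.0
        except SyntaxError:
            return 0.0

    code = open("target.py").read()
    quality = [measure_quality(code)]
    for _ in range(10):
        critique = ask("Criticize this code harshly:\n\n" + code)
        code = ask(
            "Implement the changes this review asks for. "
            "Return only the full revised code, no commentary.\n\n"
            "Review:\n" + critique + "\n\nCode:\n" + code
        )
        quality.append(measure_quality(code))
    print(quality)  # plot this against iteration count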

neya

WTF is "yes-men"? Original title: "AI overly affirms users asking for personal advice". Dear mods, can we please keep the title neutral instead of enforcing gender bias?

svara

Yeah, and if you ask it to be critical specifically to get a different perspective or just to avoid this bias, it'll go over the top in the opposite direction. This is IMO currently the top chatbot failure mode. The insidious thing is that it often feels good to read these things. Factual accuracy, by contrast, has gotten very good. I think there's a deeper philosophical dimension to this, though, in that it relates to alignment. There are situations where, in the grand scheme of things, the right thing to do would be for the chatbot to push back hard, be harsh and dismissive. But is it really aligned with the human then? Which human?

righthand

LLMs are sycophantic digital lawyers that will tell you what you want to hear until you look at the price tag and say “how much did I spend?!”

gurachek

I had exactly this between two LLMs in my project. An evaluator model that was supposed to grade a coaching model's work. Except it could see the coach's notes, so it just... agreed with everything. Coach says "user improved on conciseness", next answer is shorter, evaluator says yep great progress. The answer was shorter because the question was easier lol. I only caught it because I looked at actual score numbers after like 2 weeks of thinking everything was fine. Scores were completely flat the whole time. Fix was dumb and obvious — just don't let the evaluator see anything the coach wrote. Only raw scores. Immediately started flagging stuff that wasn't working. Kinda wild that the default behavior for LLMs is to just validate whatever context they're given.
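
For anyone building something similar, the blinding fix is really just prompt construction. A toy sketch (function and field names are made up, not from my actual project):

    # Toy version of the leak and the fix described above.

    def evaluator_prompt_leaky(scores: list[float], coach_notes: str) -> str:
        # BAD: the evaluator sees the coach's own narrative and just validates it.
        return (
            f"Coach's notes: {coach_notes}\n"
            f"Raw scores per session: {scores}\n"
            "Grade the coach's work."
        )

    def evaluator_prompt_blind(scores: list[float]) -> str:
        # GOOD: raw numbers only, so there is no narrative to agree with.
        return (
            f"Raw scores per session: {scores}\n"
            "Is the trend improving, flat, or regressing? "
            "Flag anything that is not working."
        )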

bryanrasmussen

somewhere an AI chatbot is reading this and confirming eagerly that this is indeed one of its problems and vowing to do better next time.

fathermarz

This is a skill in life with people as much as it is with LLMs. One should always question everything and build steelman arguments for oneself. Using a pros-and-cons approach brings it back to reality in most cases, especially when it comes to _serious matters_. It's less about "challenge my thinking" and more about playing it out in long-tail scenarios, thought exercises, mental models, and devil's advocate.

jordanb

Billionaires love AI chatbots so much because they invented the digital yes-man. They agree obsequiously with everything we say to them. Unfortunately for the rest of us, we don't really have the resources to protect ourselves from our bad decisions and really need that critical feedback.

stared

There is a fine line between "following my instructions" (which is what I want it to do) and "thinking all I do is great" (risky, and annoying). A good engineer will also list issues or problems, but at the same time won't deviate from what was asked because they "know better". The worst part is that it's impossible to switch off the constant praise. It is so ingrained in the fine-tuning that prompt engineering (or at least my attempts at it) just masks it a bit, and it's hard to do so without turning the model into a contrarian. But I guess the main issue (or rather, motivation) is that most people like the "do I look good in this dress?" level of reassurance (and honesty). That may work well for style and decoration. It works worse when designing technical infrastructure, where there is more ground truth than whether something seems nice.
