AI agents that argue with each other to improve decisions

rockcat12 27 points 16 comments April 25, 2026
github.com

Discussion Highlights (7 comments)

oldsecondhand

Sounds like a less efficient version of the mixture of experts approach.

zby

I don't know. It looks like an interesting idea, but I am struggling to put this politely. When I go into the repo and find that it does stuff like lip syncing of talking avatars, I start to wonder what percentage of the development effort goes into marketing.

gertlabs

Self-organizing systems are an area of research to which I think LLMs will contribute immensely. But as of now, even newer AI models are not particularly insightful. I'm always surprised by how suboptimal near-frontier LLMs are at collaborating in some of the easier cooperative environments on my benchmarking and RL platform. For example, check out a replay of consensus grid here: https://gertlabs.com/spectate

ChadMoran

I've been doing this with Claude Code and agent teams. I have a /red-team skill that uses an agent team to criticize its own work, grade and rank the feedback, incorporate the relevant feedback, and then start over. It has increased the quality of the output.
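A minimal sketch of the critique, grade, rank, incorporate, repeat loop described in this comment. It is not the commenter's actual /red-team skill: `red_team_loop`, `Feedback`, and the `critics`, `grade`, and `revise` callables are hypothetical stand-ins for whatever agent calls you would make, and the score threshold is an assumed stopping rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    text: str
    score: float  # higher means more relevant (assumed convention)


def red_team_loop(
    draft: str,
    critics: List[Callable[[str], str]],      # hypothetical: each returns a critique
    grade: Callable[[str, str], float],       # hypothetical: scores a critique of a draft
    revise: Callable[[str, List[str]], str],  # hypothetical: applies critiques to a draft
    rounds: int = 3,
    min_score: float = 0.5,
) -> str:
    """Critique the draft, grade and rank feedback, revise, then start over."""
    for _ in range(rounds):
        # 1. Each critic reviews the current draft and gets a relevance score.
        feedback = []
        for critic in critics:
            text = critic(draft)
            feedback.append(Feedback(text, grade(draft, text)))
        # 2. Keep only relevant feedback, most relevant first.
        relevant = sorted(
            (fb for fb in feedback if fb.score >= min_score),
            key=lambda fb: fb.score,
            reverse=True,
        )
        # 3. Stop early when nothing worth acting on remains.
        if not relevant:
            break
        # 4. Incorporate the feedback and loop again on the new draft.
        draft = revise(draft, [fb.text for fb in relevant])
    return draft
```

In practice each callable would wrap a model or agent invocation; capping `rounds` keeps the loop from running indefinitely if the critics never run out of complaints.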

submeta

I had good results combining Claude Code with Codex and letting them have back-and-forth sessions. Their prompts were orders of magnitude better than mine, as were their evaluation and criticism of the other LLM. What I haven't taken the time to figure out is how I'd automate their back-and-forth and stop manually copy/pasting their responses.
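One way the manual copy/paste could be automated, sketched under assumptions: each tool is wrapped in a plain function that takes a prompt and returns text. `debate`, `author`, `reviewer`, and the "LGTM" stop marker are all hypothetical; how you would actually invoke Claude Code or Codex from a script is not shown here.

```python
from typing import Callable, List, Tuple

Agent = Callable[[str], str]  # hypothetical wrapper: prompt in, reply out


def debate(
    task: str,
    author: Agent,
    reviewer: Agent,
    max_turns: int = 4,
    done_marker: str = "LGTM",  # assumed signal that the reviewer is satisfied
) -> List[Tuple[str, str]]:
    """Pass messages between two agents until the reviewer approves or turns run out."""
    transcript: List[Tuple[str, str]] = []
    message = task
    for _ in range(max_turns):
        proposal = author(message)
        transcript.append(("author", proposal))

        review = reviewer(proposal)
        transcript.append(("reviewer", review))
        if done_marker in review:
            break

        # Feed the critique back to the author along with the original task.
        message = (
            f"{task}\n\nPrevious attempt:\n{proposal}"
            f"\n\nReviewer feedback:\n{review}"
        )
    return transcript
```

The `max_turns` cap is a deliberate guard: without it, two agents that never agree would keep exchanging messages.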

oofda

I do this with Gemini and local models. Gemini is the planner and researcher; the local models basically "just type syntax". It seems to keep any of them from getting stuck in a loop.

bit1993

If you agree that there is no absolute truth in complex problems, and that different parties can have different perspectives that are all correct from their own point of view, then this type of system can be far less efficient at making decisions than a single entity. It's like having multiple CEOs in a company, or design by committee.
