Simple Meta-Harness on Islo.dev
zozo123-IB
49 points
18 comments
May 05, 2026
Discussion Highlights (7 comments)
love2read
I have no idea what this does or is. I really wish they could have given a better description of why this is useful.
cyanydeez
Serious question: I've already got an opencode harness running on a local model. It's easily installable via the insecure bash command. It's already tailored with a couple of plugins, and with a proper TODO.md and planning I can get it to loop fine, given proper attention to its pratfalls on vague/non-determinant language. It's all running on an AMD 395+ with a Qwen3-Coder-Next model and ~256k context. opencode has a web UI that I can put behind a password-protected endpoint via a simple nginx proxy, so I can keep it busy from anywhere. How does this go above and beyond this straightforward, open-source, open-weights and relatively cheap setup? Do you just get more tokens from SOTA models? Can anyone rationally say the products of token production are quality and secure?
m3kw9
This seems to be another over-optimization for AI that many are trying to get into. The LLMs improve, your setup is deprecated, and you have wasted time optimizing for a slight edge. TL;DR: You trade time for a slight edge.
mccoyb
It has now become fashionable to dress oneself in the garb of science to sell dev environments ... for agents. It has now become fashionable to claim much, and furnish little. It has now become fashionable to fail to understand or state the core of your proposal in as few words as possible: instead of "genetic algorithm applied to the space of harnesses, parallelized by our infrastructure" we get "Three swaps. Same orchestrator. Same dashboard. The wiring is the thing." We're cooked chat.
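For readers unfamiliar with the one-line summary quoted above ("genetic algorithm applied to the space of harnesses"), here is a minimal sketch of that idea. It is not the Islo implementation: the search space, the component names, and the `score` function are all hypothetical stand-ins, and a real setup would replace `score` with an actual evaluation run per candidate harness.

```python
import random

# Hypothetical search space: one choice per harness component (names are assumptions).
SEARCH_SPACE = {
    "planner": ["none", "todo_file", "judge_reviewed"],
    "context": ["full_history", "summarized", "retrieval"],
    "retries": [0, 1, 3],
}

def score(harness):
    """Toy fitness function standing in for a real evaluation suite."""
    points = {"judge_reviewed": 3, "todo_file": 2, "summarized": 1, "retrieval": 2}
    named = sum(points.get(v, 0) for v in harness.values() if isinstance(v, str))
    return named + harness["retries"]

def random_harness(rng):
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}

def crossover(a, b, rng):
    # Each component is inherited from one of the two parents.
    return {key: rng.choice([a[key], b[key]]) for key in SEARCH_SPACE}

def mutate(harness, rng):
    # Re-sample a single component at random.
    child = dict(harness)
    key = rng.choice(list(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child

def evolve(generations=10, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [random_harness(rng) for _ in range(population_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        parents = population[: population_size // 2]  # elitism: keep the top half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score), history

best, history = evolve()
print(best, history)
```

Because the top half of each generation survives unchanged, the best score never decreases between generations. The `score` calls within a generation are independent of each other, which is the part that could be run in parallel.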
vmg12
This is not how I've seen the term meta-harness used. The common usage I've seen is for a meta-harness to be a wrapper around an existing agent that gives the agent a new UI or new abilities.
visarga
I did this too, ablating all the components in my coding agent harness. The insight from my meta-optimization loops was "have judge agents review the plan and implementation". One of my own insights here is that you need to collect not just execution traces, but all the human-in-the-loop nudges and steering commands. They are one of the purest sources of feedback on coding agents when seen in context. I agree with OP on the need to collect traces and compare them, not just scores. It is a much richer source of feedback. If anyone is interested I have a slide deck about my approach: https://horiacristescu.github.io/claude-playbook-plugin/docs...
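One way to picture the point about keeping human nudges alongside execution traces is a trace record that stores both kinds of event in order, so each nudge can later be read in context. This is only a sketch under assumptions: the `TraceEvent` and `RunTrace` types, the event kind labels, and the `nudges_in_context` helper are hypothetical and are not taken from the commenter's plugin.

```python
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    step: int
    kind: str      # "agent_action" or "human_nudge" (labels are assumptions)
    content: str

@dataclass
class RunTrace:
    run_id: str
    score: float
    events: list = field(default_factory=list)

    def record(self, kind, content):
        # Steps are numbered in the order events are recorded.
        self.events.append(TraceEvent(len(self.events), kind, content))

def nudges_in_context(trace, window=1):
    """Pair each human nudge with the agent actions in the `window` events before it."""
    pairs = []
    for i, event in enumerate(trace.events):
        if event.kind == "human_nudge":
            preceding = [
                e.content
                for e in trace.events[max(0, i - window):i]
                if e.kind == "agent_action"
            ]
            pairs.append((preceding, event.content))
    return pairs

trace = RunTrace(run_id="run-1", score=0.5)
trace.record("agent_action", "edit src/app.py")
trace.record("human_nudge", "run the tests before editing")
trace.record("agent_action", "run pytest")
print(nudges_in_context(trace))
```

Storing the score next to the full event list keeps both signals available, so two runs with the same score can still be compared by what happened in them.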
gobdovan
I distilled the Meta-Harness workflow into a skill [0]. Unlike the original Islo POC, which demonstrates an automated runtime loop converging from trace-rich evaluations [1], my test only evaluates whether the distilled skill improves a lead agent's prompt-repair discipline and audit trail [2]. It took a few tries to figure out what to test in the first place, since it is not obvious what the workflow should improve (the prompt? the guided agent's ability?). The only meaningful test I ended up with was to give easy tasks with a deliberately misleading or incomplete prompt, then check whether persisting deltas and observations between successive prompts meaningfully improves a meta-agent's ability to correct the imprecise prompt (this is what I mean by "prompt-repair discipline and audit trail") [2]. From a couple more experiments (summarized in [2]), I found that the meta-agent does not really affect how well the guided agents perform; it simply gets better at improving imprecise prompts. My conclusion is that this method works for improving bad prompts, but I didn't demonstrate that it improves guided agent capabilities. However, I think it's better to work on your prompts before giving them to agents than to give bad prompts and iterate on them with a meta-agent.
[0]: https://github.com/ouatu-ro/skill-distillery/blob/main/skill...
[1]: https://github.com/zozo123/meta-harness-on-islo
[2]: https://github.com/ouatu-ro/skill-distillery/blob/main/repor...
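As an illustration of what "persisting deltas and observations between successive prompts" could look like, here is a minimal sketch. It is an assumption about the general shape of such a log, not the code from the linked skill: the `RepairLog` class, its field names, and the example prompts are hypothetical. The diff itself uses Python's standard `difflib.unified_diff`.

```python
import difflib
import json

def prompt_delta(old, new):
    """Unified diff between two prompt versions, as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="previous",
            tofile="revised",
            lineterm="",
        )
    )

class RepairLog:
    """Hypothetical audit trail: one entry per prompt revision."""

    def __init__(self):
        self.entries = []

    def add(self, old_prompt, new_prompt, observation):
        self.entries.append(
            {
                "iteration": len(self.entries) + 1,
                "delta": prompt_delta(old_prompt, new_prompt),
                "observation": observation,
            }
        )

    def to_jsonl(self):
        # One JSON object per line, so the log can be appended to a file.
        return "\n".join(json.dumps(entry) for entry in self.entries)

log = RepairLog()
log.add(
    "Sort the list.",
    "Sort the list in descending order.",
    "Guided agent sorted ascending; the task expected descending.",
)
print(log.to_jsonl())
```

Each entry records what changed in the prompt and what was observed, so a later revision can be made with the earlier ones in view.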