Context Sculpting
perceptronblues
15 points
6 comments
June 06, 2026
Discussion Highlights (4 comments)
JSR_FDED
All this mucking about with harnesses and context is really just Markdown engineering.
0gs
i definitely considered something like this for the local-first harness i made ... i just don't think most people have the RAM to be able to run two good models yet. maybe i'm wrong though. but i also think a single "agent" can compartmentalize itself into subdivisions better than we imagine (i.e., much much better than any single human can). i ended up creating a broker, though, so at least the tool calls don't eat up as much context. and the auto-reset thing is definitely legit.
theowaway213456
Not a single mention of prompt caching in this article, which is a massive benefit of append-only context.
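A minimal sketch of why append-only context is friendly to prefix caching, assuming a simplified model in which the cache matches whole messages from the start of the request. Real providers match at the token level with their own granularity rules. The message contents and the `shared_prefix_len` helper are hypothetical, for illustration only.

```python
def shared_prefix_len(prev, curr):
    """Count the leading messages that are identical in both requests."""
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n

# Hypothetical conversation state sent on the previous request.
base = ["system: you are an agent", "user: fix the bug", "tool: ls output"]

# Append-only: the next request only adds a message at the end.
appended = base + ["assistant: reading main.py"]

# Sculpted: an earlier message is rewritten, so everything after it diverges.
sculpted = [base[0], "user: fix the bug (summarized)", base[2],
            "assistant: reading main.py"]

append_hit = shared_prefix_len(base, appended)  # whole previous request is reusable
sculpt_hit = shared_prefix_len(base, sculpted)  # only the system message is reusable
```

In this toy model the append-only request reuses all three prior messages, while the sculpted request reuses only the first one, even though most of its content is unchanged.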
andai
See also: agent harness in 50 lines (based on mini-swe-agent). https://minimal-agent.com/
I followed this tutorial earlier today and I'm having a lot of fun with it. https://gist.github.com/a-n-d-a-i/cb5e929b4c87b8d185760d0264...
I added a 2nd while loop so that it takes user input. And vendored my tiny llm lib (so it's 150 lines now, and dependency free :)
---
As for context-sculpting, the economics are different when not touching the context gives you the >98% discount everyone's doing now. (Although it might be worth fiddling with the suffix ... not sure yet!) e.g. this issue:
"ToolSearch saves ~15K tokens per request in prompt size, but at the cost of breaking prefix-based caching for models like DeepSeek that rely on stable prefixes. For heavy users of DeepSeek through OpenRouter, the savings from smaller prompts are dwarfed by the increased cost from cache misses."
https://github.com/QwenLM/qwen-code/discussions/4065
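The trade-off in the quoted issue can be sketched as back-of-the-envelope arithmetic. The 98% discount and the ~15K-token saving come from the comment. The 100K-token prompt, the 95K cached tokens, the unit price, and the `request_cost` helper are assumptions for illustration, not any provider's actual pricing.

```python
def request_cost(prompt_tokens, cached_tokens,
                 price_per_token=1.0, cache_discount=0.98):
    """Cost of one request when cached tokens are billed at a discount."""
    uncached = prompt_tokens - cached_tokens
    cached_price = price_per_token * (1 - cache_discount)
    return uncached * price_per_token + cached_tokens * cached_price

# Assumed: a 100K-token prompt whose first 95K tokens hit the cache.
stable_prefix = request_cost(prompt_tokens=100_000, cached_tokens=95_000)

# Assumed: the prompt shrinks by ~15K tokens, but the changed prefix
# means nothing is served from the cache.
smaller_no_cache = request_cost(prompt_tokens=85_000, cached_tokens=0)

print(f"stable prefix:   {stable_prefix:,.0f} units")
print(f"smaller, uncached: {smaller_no_cache:,.0f} units")
```

Under these assumptions the untouched prompt costs about 6,900 units and the smaller, uncached prompt costs 85,000, which matches the comment's point that cache misses can dwarf the savings from a shorter prompt. The balance shifts if the cached share is small or the discount is lower.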