Agent Safehouse – macOS-native sandboxing for local agents

atombender 479 points 111 comments March 08, 2026

agent-safehouse.dev · View on Hacker News

Discussion Highlights (20 comments)

garganzol

While we have `sandbox-exec` in macOS, we still don't have a proper Docker for macOS. Instead, the current Docker runs on macOS as a Linux VM, which is useful, but only as far as a Linux machine goes. Having real macOS Docker would solve the problem this project solves, and 1001 other problems.

xyzzy_plugh

This is just a wrapper around sandbox-exec. It's nice that there are a ton of presets that have been thought out, since 90% of wielding sandbox-exec is correctly scoping it to whatever the inner environment requires (the other 90% is figuring out how sandbox-exec works). I like that it's just a shell script. I do wish that there was a simple way to sandbox programs with an overlay or copy-on-write semantics (or better yet bind mounts). I don't care if, in the process of doing some work, an LLM agent modifies .bashrc -- I only care if it modifies _my_ .bashrc

gozucito

so this works the same as Claude Code /sandbox? The innovation being that it's harness-agnostic?

e1g

Creator here - didn't expect this to go public so soon. A few notes:
1. I built this because I like my agents to be local. Not in a container, not in a remote server, but running on my finely-tuned machine. This helps me run all agents on full-auto, in peace.
2. Yes, it's just a policy-generator for sandbox-exec. IMO, that's the best part about the project - no dependencies, no fancy tech, no virtualization. But I did put in many hours to identify the minimum required permissions for agents to continue working with auto-updates, keychain integration, pasting images, etc. There are notes about my investigations into what each agent needs: https://agent-safehouse.dev/docs/agent-investigations/ (AI-generated)
3. You don't even need the rest of the project: you can use just the Policy Builder to generate a single sandbox-exec policy you can put into your dotfiles: https://agent-safehouse.dev/policy-builder.html
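For readers who have not seen a sandbox-exec policy, here is a minimal sketch of what a generated one might look like. It is not the project's actual output: the `make_profile` helper and the `agent.sb` file name are hypothetical, and a real agent needs many more allowances than this, which is the work the presets described above are said to do.

```shell
#!/bin/sh
# Hypothetical helper (not part of Agent Safehouse): print a minimal
# deny-by-default sandbox-exec profile that allows reads everywhere
# but writes only under a single work directory.
make_profile() {
  workdir=$1
  cat <<EOF
(version 1)
(deny default)
; let the sandboxed command start and spawn child processes
(allow process-exec)
(allow process-fork)
; read anywhere, write only inside the work directory
(allow file-read*)
(allow file-write* (subpath "$workdir"))
EOF
}

# Example usage on macOS (left commented out so the script only
# defines the function):
#   make_profile "$PWD" > agent.sb
#   sandbox-exec -f agent.sb some-agent-command
```

Most commands will likely fail under a profile this tight, so treat it as a starting point for experimenting rather than a working policy.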

naomi_kynes

The "full-auto" framing is interesting. What happens when the agent hits something it can't resolve autonomously? Even sandboxed, there's a point where the agent needs to ask a question or get approval. Most setups handle this awkwardly: fire a webhook, write to a log, hope the human is watching. The sandbox keeps the agent contained, but doesn't give it a clean "pause and ask" primitive. The agent either guesses (risky) or silently fails (frustrating). Seems like there are two layers: the security boundary (sandbox-exec, containers, etc.) and the communication boundary (how does a contained agent reach the human?). This project nails the first. The second is still awkward for most setups.

tl2do

Intriguing, but... Around last summer (July–August 2025), I desperately needed a sandbox like this. I had multiple disasters with Claude Code and other early AI models. The worst was when Claude Code did a hard git revert to restore a single file, which wiped out ~1000 lines of development work across multiple files. But now, as of March 2026, at least in my experience, agents have become more reliable. With proper guardrails in claude.md and built-in safety measures, I haven't had a major incident in about 3 months. That said, layering multiple safeguards is always recommended—your software assets are your assets. I'd still recommend using something like this. But things are changing, bit by bit.

synparb

I’ve been playing around with https://nono.sh/ , which adds a proxy to the sandbox piece to keep credentials out of the agent’s scope. It’s a little worrisome that everyone is playing catch-up on this front and many of the built-in solutions aren’t good.

vivid242

Nice! I'd be interested in the things that went wrong during development. Which loopholes were discovered last, if any?

mkagenius

A way to run Claude Code inside an Apple container:
  $ container system start
  $ container run -d --name myubuntu ubuntu:latest sleep infinity
  $ container exec myubuntu bash -c "apt-get update -qq && apt-get install -y openssh-server"
  $ container exec myubuntu bash -c "apt-get install -y curl && curl -fsSL https://deb.nodesource.com/setup_lts.x | bash - && apt-get install -y nodejs"
  $ container exec myubuntu npm install -g @anthropic-ai/claude-code
  $ container exec myubuntu claude --version

dbmikus

I like that it's all bash. How does this compare with Codex's and Claude's built-in sandboxing?

pash

Sandvault [0] (whose author is around here somewhere) is another approach that combines sandbox-exec with the granddaddy of system sandboxes, the Unix user system. Basically, give an agent its own unprivileged user account (interacting with it via sudo, SSH, and shared directories), then add sandbox-exec on top for finer-grained control of access to system resources.
[0] https://github.com/webcoyote/sandvault

zmmmmm

This is great to see. I honestly think that sandboxing is currently THE major challenge that needs to be solved for the tech to fully realise its potential. Yes the early adopters will YOLO it and run agents natively. It won't fly at all longer term or in regulated or more conservative corporate environments, let alone production systems where critical operations or data are in play. The challenge is that we need a much more sophisticated version of sandboxing than anybody has made before. We can start with network, file system and execute permissions - but we need way more than that. For example, if you really need an agent to use a browser to test your application in a live environment, capture screenshots and debug them - you have to give it all kinds of permissions that go beyond what can be constrained with a traditional sandboxing model. If it has to interact with resources that cost money (say, create cloud resources) then you need an agent aware cloud cost / billing constraint. Somehow all this needs to be pulled together into an actual cohesive approach that people can work with in a practical way.

nemo44x

Supervisor agent frameworks are going to be a big industry soon. You simply can’t have agents executing commands without a trusted supervisory layer examining and certifying actions. All the issues we get from AI today (hallucinations, goal shift, context decay, etc) get amplified unbelievably fast once you begin scaling agents out due to cascading. The risk being you go to bed and when you wake up your entire infrastructure is gone lol.

gnanagurusrgs

This is the right problem to solve. At Arcade, we see the same gap — agents get shell access, API keys, and network by default. The permissions model is backwards. sandbox-profiles is a solid primitive for local agents. The missing piece in production is the tool layer — even a sandboxed agent can still make dangerous API calls if the MCP tools it has access to aren't individually authed and scoped. The real stack is: sandbox the runtime (what Agent Safehouse does) + scope the tools (what we do with JIT OAuth at the MCP layer). Neither alone is enough. Nice work shipping this. https://www.arcade.dev/blog/ai-agent-auth-challenges-develop...

srid

If you are using Nix, there's also https://github.com/srid/sandnix that works on Linux (landrun) and macOS (sandbox-exec).

cjbarber

See also various sandbox tools I and others (e.g. jpeeler) have collected: https://news.ycombinator.com/item?id=47102258

davidcann

I made a native macOS app with a GUI for sandbox-exec, plus a network sandbox with per-domain filtering and secrets detection: https://multitui.com/

ashishb

I built something similar for myself that works on both Linux and macOS: https://github.com/ashishb/amazing-sandbox

devonkelley

Sandboxing solves "prevent the agent from doing damage." The failure mode it doesn't catch is when the agent operates perfectly within its permissions and still produces garbage because the model degraded or the tool stopped returning useful results. That's a 200 OK the whole way down. "Prevent bad actions" and "detect wrong-but-permitted actions" are completely different problems.

varenc

Fun fact about `sandbox-exec`, the macOS util this relies on: Apple officially deprecated it in macOS Sierra back in 2016! Its manpage has been saying it's deprecated for a decade now, yet we're continuing to find great uses for it. And the 'App Sandbox' replacement doesn't work at all for use cases like this where end users define their own sandbox rules. Hope Apple sees this usage and drops any plans to actually remove sandbox-exec. I recall a bunch of macOS internal services also rely on it.
