AI coding at home without going broke
sbochins
270 points
231 comments
June 13, 2026
Related Discussions
- AI coding is gambling speckx · 321 pts · March 18, 2026 · 71% similar
- Using AI to write better code more slowly signa11 · 405 pts · May 25, 2026 · 62% similar
- AI is too expensive crescit_eundo · 135 pts · May 19, 2026 · 62% similar
- An AI coding agent, used to write code, needs to reduce your maintenance costs cratermoon · 104 pts · May 10, 2026 · 62% similar
- Show HN: I tested 15 free AI models at building real software on a $25/year VPS j0rg3 · 17 pts · April 02, 2026 · 61% similar
Discussion Highlights (20 comments)
atreids
I find just going via Deepseek's platform API directly, using their V4 flash model, and hooking into a harness like Opencode more than acceptable. Think I've spent maybe $10 over a couple of weeks. I did explore self-hosting models but hardware right now is just too expensive.
isatty
> The first is to self host. You buy the machine, run open source models locally, and pay nothing per token after that.
Power is not free. What I’ve found is that you’re basically paying a premium for privacy, and that’s worth it for me.
OutOfHere
Fixed-price monthly plans ought to be sufficient for most people who actually review their spec and code, for building production-grade software that stands the test of time. A careful spec+review+iteration takes time, resetting the usage quota. Granted, security audits use tokens too. If you still need more tokens, odds are that you're vibecoding unmaintainable throwaway trash.
quickthoughts
Ha just wrote a post[1] about a sort of 4th option - max out cheap compute to create more tangible things that can be used/run locally. 1: https://news.ycombinator.com/item?id=48519181
gaigalas
> The first is to self host. You buy the machine, run open source models locally, and pay nothing per token after that.
In the good ol' days, we bought machines not only to run stuff, but to experiment. I understand that today experiments are limited. Inference is reasonable, fine-tuning is either niche or a stretch, and base training is impossible. *That is bound to change*, and when it does, there will be an avalanche of hobbyists and amateurs poking at base training. They'll find optimizations no one found before, synthesize data no one ever imagined synthesizing, and when that happens we'll start getting libre models. So, yeah. Right now, buying the machine doesn't pay off that well, unless you want to pioneer this stuff in severely adverse conditions (hardware prices inflated, etc). Eventually, it will.
esalman
For me, investing in hardware seems to be the way to go. I learned coding nearly 24 years ago and am still learning new stuff all the time. At no point did I have to rely on a subscription model to learn and do new stuff. If LLMs and agents are the default tools for coding and building software, at least for the next few years, it seems like a no-brainer to invest $2000-3000 in hardware, like a Strix Halo PC.
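A rough way to sanity-check the hardware-versus-subscription trade-off is a break-even calculation. The sketch below is illustrative only: the $2000-3000 hardware range comes from the comment above and the $100/month plan price from elsewhere in the thread, while the `power_per_month` parameter and the function name are assumptions introduced here. It also ignores any quality gap between local and hosted models.

```python
import math

def breakeven_months(hardware_cost, subscription_per_month, power_per_month=0.0):
    """Months until a one-off hardware purchase is offset by avoided plan fees.

    power_per_month is a hypothetical electricity cost for running the machine;
    set it from your own tariff and usage.
    """
    monthly_saving = subscription_per_month - power_per_month
    if monthly_saving <= 0:
        # The machine never pays for itself if power costs at least as much as the plan.
        return math.inf
    return hardware_cost / monthly_saving

if __name__ == "__main__":
    for price in (2000, 3000):
        months = breakeven_months(price, subscription_per_month=100)
        print(f"${price} machine vs $100/month plan: {months:.0f} months")
```

With power set to zero, the two ends of the price range work out to 20 and 30 months; any non-zero power cost pushes the break-even point later.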
vadansky
Can I run something comparable to Opus 4.6 locally yet? I keep hearing conflicting things. If I can spend 10k to do that I would cancel my subscription. The problem is I don’t wanna spend the money to find out myself.
pianopatrick
I think someone could find some way to use the smaller local models to write code. Some kind of framework or harness or language or something. But not too many people are working on that because the big models are pretty cheap and a lot better.
dempedempe
Did you just copy-and-paste an AI response and post it on your blog?
impure
I recently made an AI Agent and surprisingly coding with DeepSeek V4 Flash is quite cheap. It probably has to do with the aggressive prompt caching. I'm using OpenRouter with Novita AI as the preferred provider.
RomanPushkin
AI coding at home literally costs $100/month. I'm wondering where the $400 is coming from. $100 is more than enough for "coding at home", IMO. I rarely hit the limits, and when I do it's just time for a quick walk anyway.
zuzululu
Another update for Codex users: they let you accumulate resets, which greatly adds to the mileage. I don't think it's feasible to have something comparable to these frontier models when they are increasing usage and lowering token costs.
mwcampbell
I invested about $4,000 in an NVIDIA DGX Spark several months ago. 128 GB of unified RAM, and the NVIDIA GB10 chip. With the RAM, the several CPU cores, and the 4 TB NVMe SSD, it's a very capable ARM64 Linux computer even without the GPU, and so far I've mostly been using it as such. But I wonder, what's the most capable model, specifically for coding, that can run well on that hardware?
jacobgold
"Around $400 a month of plans buys roughly $2800 of API usage at list prices, which is a real bargain right up until you hit the ceiling." I realize this text is just slop but it never stops being a "real bargain" at any point. And it's more like $200/mo for $4000+/mo in tokens. You can also buy additional subscriptions. There's no sense in running local models or doing anything else as long as VCs (and soon the public markets) are willing to pay your bill.
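The two sets of figures in this comment can be compared as a simple ratio of list-price API value to plan cost. This is only a sketch of the arithmetic: the dollar amounts are the ones quoted above (with "$4000+" treated as exactly $4000), and the function name is made up for illustration.

```python
def subscription_leverage(plan_cost_per_month, api_value_per_month):
    """Dollars of list-price API usage obtained per dollar of plan cost."""
    return api_value_per_month / plan_cost_per_month

# Figures quoted from the article: about $400 of plans for roughly $2800 of usage.
article_ratio = subscription_leverage(400, 2800)

# Figures from the commenter: $200/month for $4000 or more of tokens (lower bound).
commenter_ratio = subscription_leverage(200, 4000)

if __name__ == "__main__":
    print(f"article:   {article_ratio:.0f}x")
    print(f"commenter: {commenter_ratio:.0f}x or more")
```

On these numbers the article implies about 7x and the commenter at least 20x, so the disagreement is about the size of the subsidy rather than whether one exists.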
abc42
What kind of usage chews through Claude Max x20? I use several agents with max effort in parallel and usually end up with something like 50% weekly usage. Fable almost allowed me to get to 70% but then they started resetting the limits mid-week and of course now ended the whole thing.
tamimio
You can have opencode switch between multiple providers on the fly based on the task you are doing: normal tasks use deepseek, for example, hard ones use gpt5 or opus4. Track the usage with something like codexbar or similar. OpenRouter seems to charge extra on top of the API costs, same with zen ide, so keep that in mind.
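The routing idea can be sketched in a few lines. Everything here is hypothetical: the route labels are the informal names from the comment rather than real API model identifiers, the `markup` value is a placeholder rather than any aggregator's actual fee, and none of this reflects how opencode or codexbar are implemented.

```python
from dataclasses import dataclass, field

# Informal labels taken from the comment; not real model IDs.
ROUTES = {"normal": "deepseek", "hard": "opus4"}

def pick_provider(difficulty):
    """Return the provider label for a task, falling back to the cheap default."""
    return ROUTES.get(difficulty, ROUTES["normal"])

@dataclass
class UsageTracker:
    """Accumulates billed spend per provider, including any aggregator markup."""
    spend: dict = field(default_factory=dict)

    def record(self, provider, cost_usd, markup=0.0):
        # markup is an assumed fractional fee on top of the raw API cost.
        billed = cost_usd * (1 + markup)
        self.spend[provider] = self.spend.get(provider, 0.0) + billed
        return billed

if __name__ == "__main__":
    tracker = UsageTracker()
    tracker.record(pick_provider("normal"), 0.02)
    tracker.record(pick_provider("hard"), 1.50, markup=0.05)  # placeholder markup
    print(tracker.spend)
```

Keeping the markup as an explicit parameter makes it easy to see how much of the bill comes from the aggregator rather than the underlying provider.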
MemoryHoleHQ
I've been thinking a lot about this, and my personal take right now is that at some point in the near-to-medium future the models available to run at home, and the hardware needed to run them, will be enough. My baseline is sonnet 4.6. I sincerely think it's good enough for most tasks. So, from what I see, we are already at a point where we don't need frontier models for serious coding and debugging. Give it a couple of years and that level will fit in 120B models. At the same time, we saw the rise of direct access memory systems like DGX or Strix Halo that will allow running models of this size for "cheap" in the medium term. That's what I'm betting on: that in 2 years I can buy a system for about $2500 that will run a model similar to Sonnet 4.6 locally. I might be spectacularly wrong, though. But I'm willing to wait and use subscriptions/API calls for now.
13415
I use copy & paste with a pro subscription. I guess I'm a bit behind in terms of tool use but it works great for me.
sesm
> Do that well and you can build what a team of twenty engineers would put out in a month for around a thousand dollars.
As usual, an extraordinary claim without extraordinary evidence: https://stephen.bochinski.dev/apps/
tunesmith
I feel like I must have plateaued and don't know what to do next to level up. I'm currently on the $100/month codex plan and it seems fine using 5.5-xhigh all the time. I think of what to do next, have a chat session to determine exactly what to ask for up to the point of being ready to implement, and then codex churns on a commit-sized task, whereupon I briefly check it on my local dev server. If necessary I ask for a change. Then I ask it to commit and recommend the next step based on the spec. Oftentimes I have to "approve" an out-of-sandbox request anyway. I haven't found anything that requires running all night. I could tell it to one-shot a big plan, but given how often I realize I want an intermediate thing to be slightly different, it seems like a waste of effort. I'm guessing the next thing I should probably look into is some sort of VM I can tunnel my codex-gui requests to so I don't have to deal with the sandbox approvals (I don't want to give it "dangerous" access to my entire mac). I don't understand what people are doing with their side projects that is leading them to churn through tokens so quickly, to the point of requiring two $200/month subscriptions and a bunch of token charges besides.