AI coding at home without going broke

sbochins 270 points 231 comments June 13, 2026
stephen.bochinski.dev

Discussion Highlights (20 comments)

atreids

I find that going through DeepSeek's platform API directly, using their V4 Flash model, and hooking it into a harness like Opencode is more than acceptable. I think I've spent maybe $10 over a couple of weeks. I did explore self-hosting models, but hardware right now is just too expensive.

isatty

> The first is to self host. You buy the machine, run open source models locally, and pay nothing per token after that.

Power is not free. What I’ve found is that you’re basically paying a premium for privacy, and that’s worth it for me.

OutOfHere

Fixed-price monthly plans ought to be sufficient for most people who actually review their spec and code when building production-grade software that stands the test of time. A careful spec+review+iteration cycle takes time, during which the usage quota resets. Granted, security audits use tokens too. If you still need more tokens, odds are that you're vibecoding unmaintainable throwaway trash.

quickthoughts

Ha, just wrote a post[1] about a sort of 4th option: max out cheap compute to create more tangible things that can be used/run locally.

[1] https://news.ycombinator.com/item?id=48519181

gaigalas

> The first is to self host. You buy the machine, run open source models locally, and pay nothing per token after that.

In the good ol' days, we bought machines not only to run stuff, but to experiment. I understand that today experiments are limited. Inference is reasonable, fine-tuning is either niche or a stretch, and base training is impossible. *That is bound to change*, and when it does, there will be an avalanche of hobbyists and amateurs poking at base training. They'll find optimizations no one found before, synthesize data no one ever imagined synthesizing, and when that happens we'll start getting libre models.

So, yeah. Right now, buying the machine doesn't pay off that well, unless you want to pioneer this stuff in severely adverse conditions (hardware prices inflated, etc). Eventually, it will.

esalman

For me, investing in hardware seems to be the way to go. I learned coding nearly 24 years ago and am still learning new stuff all the time. At no point did I have to rely on a subscription model to learn and do new stuff. If LLMs and agents are the default tools for coding and building software, at least for the next few years, it seems like a no-brainer to invest $2000-3000 in hardware, like a Strix Halo PC.

vadansky

Can I run something comparable to Opus 4.6 locally yet? I keep hearing conflicting things. If I can spend 10k to do that I would cancel my subscription. The problem is I don’t wanna spend the money to find out myself.

pianopatrick

I think someone could find some way to use the smaller local models to write code. Some kind of framework or harness or language or something. But not too many people are working on that because the big models are pretty cheap and a lot better.

dempedempe

Did you just copy-and-paste an AI response and post it on your blog?

impure

I recently made an AI Agent and surprisingly coding with DeepSeek V4 Flash is quite cheap. It probably has to do with the aggressive prompt caching. I'm using OpenRouter with Novita AI as the preferred provider.

RomanPushkin

AI coding at home literally costs $100/month. I'm wondering where the $400 is coming from? $100 is more than enough for "coding at home", IMO. I rarely hit the limits, and when I do it's just time for a quick walk anyway.

zuzululu

Another update for Codex users: they let you accumulate resets, which greatly adds to the mileage. I don't think it's feasible to have something comparable to these frontier models when they are increasing usage and lowering token costs.

mwcampbell

I invested about $4,000 in an NVIDIA DGX Spark several months ago. 128 GB of unified RAM, and the NVIDIA GB10 chip. With the RAM, the several CPU cores, and the 4 TB NVMe SSD, it's a very capable ARM64 Linux computer even without the GPU, and so far I've mostly been using it as such. But I wonder, what's the most capable model, specifically for coding, that can run well on that hardware?

jacobgold

"Around $400 a month of plans buys roughly $2800 of API usage at list prices, which is a real bargain right up until you hit the ceiling." I realize this text is just slop but it never stops being a "real bargain" at any point. And it's more like $200/mo for $4000+/mo in tokens. You can also buy additional subscriptions. There's no sense in running local models or doing anything else as long as VCs (and soon the public markets) are willing to pay your bill.

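The two ratios being compared can be worked out directly. This is a rough sketch that uses only the figures quoted in this comment; the function name `leverage` is made up for illustration, and "$4000+" is treated as a lower bound.

```python
def leverage(plan_cost_usd: float, api_value_usd: float) -> float:
    """Return how many dollars of list-price API usage each plan dollar buys."""
    if plan_cost_usd <= 0:
        raise ValueError("plan cost must be positive")
    return api_value_usd / plan_cost_usd


# Figures quoted from the article: $400 of plans for roughly $2800 of usage.
article_ratio = leverage(400, 2800)

# Figures from the comment: $200/mo for $4000+ of tokens (floor, not exact).
comment_ratio = leverage(200, 4000)

print(f"article: {article_ratio:.0f}x, comment: at least {comment_ratio:.0f}x")
```

On the article's numbers a plan dollar buys about 7 dollars of list-price usage; on the commenter's numbers it buys at least 20.
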
abc42

What kind of usage chews through Claude Max x20? I use several agents with max effort in parallel and usually end up with something like 50% weekly usage. Fable almost allowed me to get to 70% but then they started resetting the limits mid-week and of course now ended the whole thing.

tamimio

You can use opencode and switch between multiple providers on the fly based on the task you are doing: normal tasks use deepseek, for example, and hard ones use gpt5 or opus4. Track the usage with something like codexbar or similar. OpenRouter seems to charge extra on top of the API costs, same with zen ide, so keep that in mind.
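
A rough sketch of that routing idea, written as plain Python rather than real opencode configuration (this is not opencode's config format or API). The model labels are the informal names used in the comment; `ROUTES`, `pick_model`, and the difficulty labels are hypothetical.

```python
# Hypothetical routing table: task difficulty -> models in order of preference.
ROUTES = {
    "normal": ["deepseek"],
    "hard": ["gpt5", "opus4"],
}


def pick_model(difficulty: str, unavailable: frozenset = frozenset()) -> str:
    """Return the first preferred model for this difficulty that is available."""
    # Unknown difficulty labels are treated as normal tasks.
    candidates = ROUTES.get(difficulty, ROUTES["normal"])
    for model in candidates:
        if model not in unavailable:
            return model
    # Every preferred model is exhausted: fall back to the cheap default.
    return ROUTES["normal"][0]


print(pick_model("normal"))
print(pick_model("hard"))
print(pick_model("hard", unavailable=frozenset({"gpt5"})))
```

The `unavailable` set stands in for whatever quota or rate-limit signal a usage tracker would provide.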

MemoryHoleHQ

I've been thinking a lot about this, and my personal take right now is that at some point in the near-to-medium future the models available to run at home, and the hardware needed to run them, will be enough. My baseline is Sonnet 4.6. I sincerely think it's good enough for most tasks. So, from what I see, we are already at a point where we don't need frontier models for serious coding and debugging. Give it a couple of years and that level will fit in 120B models. At the same time, we saw the rise of direct access memory systems like DGX or Strix Halo that will allow running models of this size for "cheap" in the medium term. That's what I'm betting on: that in 2 years I can buy a system for about $2500 that will run a model similar to Sonnet 4.6 locally. I might be spectacularly wrong though. But I'm willing to wait and use subscriptions/API calls for now.

13415

I use copy & paste with a pro subscription. I guess I'm a bit behind in terms of tool use but it works great for me.

sesm

> Do that well and you can build what a team of twenty engineers would put out in a month for around a thousand dollars.

As usual, an extraordinary claim without extraordinary evidence: https://stephen.bochinski.dev/apps/

tunesmith

I feel like I must have plateaued and don't know what to do next to level up. I'm currently on the $100/month codex plan and it seems fine using 5.5-xhigh all the time. I think of what to do next, have a chat session to determine exactly what to ask for up to the point of being ready to implement, and then codex churns on a commit-sized task, whereupon I briefly check it on my local dev server. If necessary I ask for a change. Then I ask it to commit and recommend the next step based on the spec. Oftentimes I have to "approve" an out-of-sandbox request anyway. I haven't found anything that requires running all night. I could tell it to one-shot a big plan, but given how often I realize I want an intermediary thing to be slightly different, it seems like a waste of effort. I'm guessing the next thing I should probably look into is some sort of VM I can tunnel my codex-gui requests to so I don't have to deal with the sandbox approvals (I don't want to give it "dangerous" access to my entire Mac). I don't understand what people are doing with their side projects that is leading them to churn through tokens so quickly, to the point of requiring two $200/month subscriptions and a bunch of token charges besides.
