Tinybox – Offline AI device 120B parameters
albelfio
414 points
251 comments
March 21, 2026
Related Discussions
Found 5 related stories in 56.8ms across 3,663 title embeddings via pgvector HNSW
- GPT‑5.4 Mini and Nano · meetpateltech · 217 pts · March 17, 2026 · 49% similar
- Mistral Small 4 · pember · 56 pts · March 16, 2026 · 46% similar
- MacBook M5 Pro and Qwen3.5 = Local AI Security System · aegis_camera · 158 pts · March 20, 2026 · 46% similar
- BitNet: 100B Param 1-Bit model for local CPUs · redm · 326 pts · March 11, 2026 · 46% similar
- Can I run AI locally? · ricardbejarano · 1103 pts · March 13, 2026 · 45% similar
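For the curious, the related-stories lookup described above is a plain nearest-neighbour query over title embeddings. A minimal sketch of what it might look like with psycopg and pgvector; the table name, vector size, and schema are hypothetical, not the actual setup behind this digest:

```python
import psycopg  # psycopg 3; the pgvector extension must be installed in Postgres

# Hypothetical schema, for illustration only:
#   CREATE TABLE stories (id bigint, title text, embedding vector(1536));
#   CREATE INDEX ON stories USING hnsw (embedding vector_cosine_ops);
def related_stories(conn, query_embedding, limit=5):
    """Return the titles closest to query_embedding by cosine distance (<=>)."""
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"  # pgvector text format
    return conn.execute(
        """
        SELECT title, 1 - (embedding <=> %s::vector) AS similarity
        FROM stories
        ORDER BY embedding <=> %s::vector
        LIMIT %s
        """,
        (vec, vec, limit),
    ).fetchall()
```

The HNSW index is what keeps a top-5 scan like this in the tens-of-milliseconds range across a few thousand embeddings.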
Discussion Highlights (20 comments)
wongarsu
Sounds like a solid prebuilt with well-balanced components and a pretty case. Not revolutionary in any way, but nice. Unless I'm missing something here?
heinternets
exabox: 720x RDNA5 AT0 XL, 25,920 GB VRAM, 23,040 GB system RAM, ~$10 million. Who is the target market here?
vlovich123
Surprising to see this with AMD GPUs, considering how George famously threw up his hands at AMD as not being worth working with.
vessenes
The exabox is interesting. I wonder who the customer is; after watching the Vera Rubin launch, I cannot imagine deciding I wanted to compete with NVIDIA for hyperscale business right now. Maybe it’s aiming at a value-conscious buyer? Maybe it’s a sensible buy for a (relatively) cash-strapped ML startup; actually I just checked prices, and it looks like Vera Rubin costs half for a similar amount of GPU RAM. I’m certain that the interconnect will not be as good as NV’s. I have no idea who would buy this. Maybe if you think Vera Rubin is three years out? But NV ships, man, they are shipping.
orliesaurus
I wonder if this is on the front page right now because of the other "tiiny" video (the names are similar) that went viral ... which turns out wasn't an actual product from tinygrad, the company linked in this post [1] [1] https://x.com/ShriKaranHanda/status/2035284883384553953
comrade1234
Cool that you have a dual power supply model. It says rack mountable or free standing. Does that mean two form factors? $65K is more than we can afford right now, but we are definitely in the market, eventually, for something we can run in our own colo. It's funny though... we're using deepseek now for features in our service, and based on our customer type we thought they would be completely against sending their data to a third party. We thought we'd have to do everything locally. But they seem OK with deepseek, which is practically free. And the few customers that still worry about privacy may not justify such a high price point.
jauntywundrkind
My interest in anything associated with geohot took a colossal nosedive today after seeing this post against democracy, quoting frelling M*ncius M*ldbug: Democracy is a Liability. https://news.ycombinator.com/item?id=47469543 https://geohot.github.io//blog/jekyll/update/2026/03/21/demo... There's a lot in there that makes sense and that I think needs to be considered. But a lot of it just seems to come out of the blue, included without connection, in my view. It feels like in-group messaging that I don't understand. How this ends up framed as being against democracy is unclear to me, and revolting. I do think we must grapple with the world as it is, and this post is strongly in that territory, but letting fear be the dominant ruling emotion is one of the main definitions of conservatism, and its use here to scare us reads badly.
ivraatiems
There's some irony in the fact that this website reads as extremely NOT AI-generated, very human in the way it's designed and the tone of its writing. Still, this is a great idea, and one I hope takes off. I think there's a good argument that the future of AI is in locally-trained models for everyone, rather than relying on a big company's own model. One thought: The ability to conveniently get this onto a 240v circuit would be nice. Having to find two different 120v circuits to plug this into will be a pain for many folks.
droidjj
Adding this to my list of ~beautifully~ designed things to buy when I win the lottery.
throwatdem12311
Finally, a computer that should be able to run Monster Hunter Wilds with decent performance. But let's be real, $12k is kinda pushing it - what kind of people are gonna spend $65k or even $10M (lmao WTAF) on a boutique thing like this? I don't think these kinds of things go in datacenters (happy to be corrected), and they are way too expensive (and probably way too HOT) to just go in a home or even an office "closet".
mayukh
What’s the most effective ~$5k setup today? Interested in what people are actually running.
operatingthetan
The incremental price increases between products are funny: $12,000, $65,000, $10,000,000.
sudo_cowsay
I always wonder about these expensive products: does the company build them once they're ordered, or do they just make them beforehand?
bastawhiz
There's no way the red v2 is doing anything with a 120B parameter model. I just finished building a dual A100 AI homelab (80GB VRAM combined, with NVLink), with otherwise similar stats. 120B only fits with very heavy quantization, enough to make the model schizophrenic in my experience. And there's no room for the KV cache, so you'll OOM around 4k of context. I'm running a 70B model now that's okay, but it's still fairly tight, and I've got 16GB more VRAM than the red v2. I'm also confused why this is 12U; my whole rig is 4U. The green v2 has better GPUs, but for $65k I'd expect a much better CPU and 256GB of RAM. It's not like a Threadripper 7000 is going to break the bank. I'm glad this exists, but it's... honestly pretty perplexing.
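The memory arithmetic behind that is easy to sanity-check. A rough sketch below; every architecture number (layers, KV heads, head dim) is an assumption for illustration, not a spec of any real 120B model or of the tinybox:

```python
# Back-of-the-envelope VRAM math for a dense ~120B model.
# All architecture numbers below are assumptions for illustration.
PARAMS = 120e9        # parameter count
N_LAYERS = 88         # assumed transformer layers
N_KV_HEADS = 8        # assumed grouped-query KV heads
HEAD_DIM = 128        # assumed head dimension
KV_BYTES = 2          # fp16 KV cache entries

def weight_gb(bits_per_param: float) -> float:
    """Weight memory in GB at a given quantization level."""
    return PARAMS * bits_per_param / 8 / 1e9

def kv_cache_gb(context_tokens: int) -> float:
    """KV cache in GB: keys + values per layer, per token."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return per_token * context_tokens / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_gb(bits):.0f} GB")
for ctx in (4096, 32768):
    print(f"KV cache @ {ctx} tokens: ~{kv_cache_gb(ctx):.1f} GB")
```

Under these assumptions the 4-bit weights alone come to roughly 60 GB, so an 80 GB rig has little headroom once KV cache, activations, and runtime overhead are added; a model without grouped-query attention, or an fp16 cache at longer context, eats the remainder much faster.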
operatingthetan
Are we at the point where 2x 9070 XTs are a viable LLM platform? (I know this has 4, just wondering for myself.)
ekropotin
IDK, I feel it's quite overpriced, even at current component prices. I'm almost sure it's possible to custom-build a machine as powerful as their red v2 within a $9k budget, and have a lot of fun along the way.
himata4113
The exabox reads as if it were poking fun at something or someone. If it's real, then it's really interesting!
andai
Can someone explain the exabox? They say it "functions as a single GPU". Does anything like that currently exist?
ppap3
I thought there was a typo in the price
mmoustafa
I would love to see real-life tokens/sec values advertised for one or more specific open-source models. I'm currently shopping for offline hardware, and it's very hard to estimate the performance I'll get before dropping $12K. I'd love a baseline I can count on out of the box, e.g. at least 40 tok/s running GPT-OSS-120B using Ollama on Ubuntu.
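For what it's worth, you can at least establish your own baseline on hardware you already have: Ollama's local API reports token counts and durations per request, so one call gives a tok/s figure. A minimal sketch, assuming a local Ollama install; the model tag and prompt are placeholders, and durations come back in nanoseconds:

```python
import requests

# Rough tokens/sec check against a local Ollama instance.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gpt-oss:120b",  # assumed tag; check with `ollama list`
        "prompt": "Explain HNSW indexes in two paragraphs.",
        "stream": False,
    },
    timeout=600,
).json()

# eval_count = generated tokens, eval_duration = nanoseconds spent generating.
gen_tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)
prompt_tps = resp["prompt_eval_count"] / (resp["prompt_eval_duration"] / 1e9)
print(f"prompt processing: {prompt_tps:.1f} tok/s")
print(f"generation:        {gen_tps:.1f} tok/s")
```

Averaging a few runs with a longer prompt gives a steadier number than a single short request.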