How "Hardwired" AI Will Destroy Nvidia's Empire and Change the World
amelius
19 points
12 comments
March 14, 2026
Related Discussions
Found 5 related stories in 56.0ms across 3,471 title embeddings via pgvector HNSW
- The Inference Shift – How Cheap Chips Could Put Frontier AI in Everyone's Hands arcanemachiner · 11 pts · March 31, 2026 · 60% similar
- Nvidia CEO Jensen Huang says 'I think we've achieved AGI' mmastrac · 15 pts · March 23, 2026 · 59% similar
- Nvidia CEO Jensen Huang says 'I think we've achieved AGI' iugtmkbdfil834 · 11 pts · March 24, 2026 · 59% similar
- Nvidia's Huang pitches AI tokens on top of salary wmat · 19 pts · March 20, 2026 · 57% similar
- AI-written code will be dead by 2036 riffonio · 14 pts · March 02, 2026 · 56% similar
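For the curious, a minimal sketch of the kind of query behind the similarity lookup described above, assuming a `stories` table with a pgvector `embedding` column and an HNSW index; the table and column names are hypothetical, not the site's actual schema.

```python
# Minimal sketch of a pgvector HNSW similarity lookup (schema is hypothetical).
import psycopg
from pgvector.psycopg import register_vector

def related_stories(conn, query_embedding, limit=5):
    # "<=>" is pgvector's cosine-distance operator; an HNSW index built with
    # vector_cosine_ops lets Postgres answer this approximately and quickly.
    return conn.execute(
        """
        SELECT title, 1 - (embedding <=> %s) AS similarity
        FROM stories
        ORDER BY embedding <=> %s
        LIMIT %s
        """,
        (query_embedding, query_embedding, limit),
    ).fetchall()

# conn = psycopg.connect("dbname=stories")
# register_vector(conn)                      # teaches psycopg the vector type
# print(related_stories(conn, title_embedding))
```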
Discussion Highlights (8 comments)
amelius
It's crazy. In a few years we will be able to buy Qwen on a chip, doing 10K tokens per second.
comandillos
This is still far from being viable for actually useful models, like bigger MoE ones with much larger context windows. The technology is very promising, much like Cerebras, but we'll need to see whether they can keep this up as the models evolve over the next few years. Extremely interesting nevertheless.
spzb
Is this a paid ad placement? I'm seeing a load of breathless "commentary" on Taalas and next to no serious discussion about whether their approach is even remotely scalable. A one-off tech demo using a comparatively ancient open source model is hardly going to be giving Jensen Huang sleepless nights.
androiddrew
Give me a 120B dense model on one of these and yeah my API use will probably drop.
exabrial
I always thought that once we had the models figured out, getting the meat of them into an FPGA was the logical next step. They seem to have skipped that and are writing the program directly into an ASIC (ROM). Pretty wild.
killbot5000
The foundation models themselves will be cheap to deploy, but we’ll still need general-purpose inferencing hardware to work alongside them, converting latent intermediate layers into useful, application-specific concerns. This may level off the demand for “gpu/tpu” hardware, though, by letting the biggest and most expensive layers move to silicon.
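A rough sketch of that split: a frozen backbone stands in for the hardwired chip and a small retrainable head stands in for the general-purpose side. All class names and sizes below are illustrative, not Taalas's actual interface.

```python
# Frozen backbone ~ fixed-function silicon; small head ~ flexible GPU/CPU work.
import torch
import torch.nn as nn

class HardwiredBackboneStub(nn.Module):
    """Pretend fixed-function silicon: tokens in, latents out, weights never change."""
    def __init__(self, vocab=32000, d_model=1024, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=16, batch_first=True)
            for _ in range(n_layers)
        )
        for p in self.parameters():
            p.requires_grad = False  # "baked into the chip", never updated

    @torch.no_grad()
    def forward(self, token_ids):
        x = self.embed(token_ids)
        for block in self.blocks:
            x = block(x)
        return x  # the latent intermediate representation

class AppSpecificHead(nn.Module):
    """The part that still wants general-purpose hardware: small and task-specific."""
    def __init__(self, d_model=1024, n_classes=10):
        super().__init__()
        self.proj = nn.Linear(d_model, n_classes)

    def forward(self, latents):
        return self.proj(latents.mean(dim=1))  # pool over tokens, then classify

# latents = HardwiredBackboneStub()(torch.randint(0, 32000, (1, 16)))  # cheap, fixed
# logits  = AppSpecificHead()(latents)                                 # flexible
```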
choilive
I speculate that they are hitting the reticle limit for models not much bigger than this. Judging by the size of the chip in their demonstrator for an 8B model, I'm sure they know this already. Scaling up means splitting large models across multiple chips (layer or tensor parallelism), which gets complicated quickly and requires really high-bandwidth/low-latency interconnects. Still a REALLY interesting approach with a ton of potential despite the unstated challenges.
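A toy illustration of the layer-parallel option: shard a model's layers across several "chips" (torch devices here, as a stand-in) and hand activations between them. The hand-off inside `forward_pipelined()` is exactly where the interconnect bandwidth/latency problem lives; the function and variable names are made up for this sketch.

```python
# Naive layer parallelism: contiguous groups of layers on separate devices.
import torch
import torch.nn as nn

def shard_layers(layers, devices):
    """Assign contiguous groups of layers to devices."""
    per_dev = (len(layers) + len(devices) - 1) // len(devices)
    return [
        nn.Sequential(*layers[i * per_dev:(i + 1) * per_dev]).to(dev)
        for i, dev in enumerate(devices)
    ]

def forward_pipelined(shards, devices, x):
    for shard, dev in zip(shards, devices):
        x = x.to(dev)   # each hop crosses the chip-to-chip interconnect
        x = shard(x)
    return x

# layers  = [nn.Linear(1024, 1024) for _ in range(8)]
# devices = ["cpu", "cpu"]          # stand-ins for two separate dies
# y = forward_pipelined(shard_layers(layers, devices), devices, torch.randn(1, 1024))
```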
jnaina
Hmm, isn't manufacturing the elephant in the room here? What am I missing? The HC1 is built on TSMC’s N6 process with an 815 mm² die. TSMC’s capacity is already heavily allocated to major customers such as NVIDIA, AMD, Apple, and Qualcomm, and a startup cannot easily secure large wafer volumes because foundry allocation is typically driven by long-term revenue commitments. The supply side cannot scale quickly either: building new foundry capacity takes many years. TSMC’s Arizona fab has been under development since 2021 and is still not producing at scale, and Samsung’s Texas fab and Intel’s Ohio project face similarly long timelines. Expanding semiconductor production requires massive construction, EUV equipment from ASML, yield tuning, and specialized workforce training. Even if demand for hardwired AI chips surged, the manufacturing ecosystem would take close to a decade to respond.
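A back-of-envelope on that supply point: the 815 mm² / N6 die size is from the comment above, but the wafer allocation and yield figures below are pure assumptions, included only to show the scale involved.

```python
# Rough dies-per-wafer and chips-per-month estimate; allocation/yield assumed.
import math

DIE_AREA_MM2 = 815.0        # HC1 die, per the comment
WAFER_DIAMETER_MM = 300.0   # standard production wafer

def dies_per_wafer(die_area, wafer_d=WAFER_DIAMETER_MM):
    # Common approximation: wafer area / die area, minus an edge-loss term.
    r = wafer_d / 2
    return math.pi * r**2 / die_area - math.pi * wafer_d / math.sqrt(2 * die_area)

gross = dies_per_wafer(DIE_AREA_MM2)   # roughly 60 candidate dies per wafer
yield_rate = 0.5                       # assumed; dies this large yield poorly
wafers_per_month = 1000                # assumed allocation for a small customer
good_chips = gross * yield_rate * wafers_per_month
print(f"~{gross:.0f} candidates/wafer, ~{good_chips:,.0f} good chips/month")
```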