768GB Intel Optane DIMMs to run 1T-parameter LLM with single GPU at 4tps
walterbell
26 points
2 comments
May 30, 2026
Related Discussions
- Real-time LLM Inference on Standard GPUs: 3k tokens/s per request NicoConstant · 202 pts · May 29, 2026 · 63% similar
- NanoGPT Slowrun: 10x Data Efficiency with Infinite Compute sdpmas · 122 pts · March 19, 2026 · 58% similar
- Intel's make-or-break 18A process node debuts for data center with 288-core Xeon vanburen · 270 pts · March 03, 2026 · 58% similar
- MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU chrsw · 280 pts · April 08, 2026 · 58% similar
- $500 GPU outperforms Claude Sonnet on coding benchmarks yogthos · 142 pts · March 26, 2026 · 54% similar
Discussion Highlights (2 comments)
lostmsu
The bottleneck in this setup is the PCIe bus. You don't need Optane to saturate it. 4 regular SSDs might do just fine.
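A rough sanity check of that claim, as a sketch under stated assumptions: the post does not say which PCIe generation the GPU uses, so this assumes a PCIe 4.0 x16 link (16 GT/s per lane, 128b/130b encoding) and assumes each "regular" SSD is a PCIe 4.0 x4 NVMe drive reading about 7 GB/s sequentially. Only the 4 tokens/s figure comes from the title; every other number and name here is illustrative.

```python
# Back-of-envelope bandwidth check. Hardware figures are ASSUMPTIONS, not from the post.

def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s, before protocol overhead."""
    return gt_per_s * lanes * encoding / 8

GPU_LINK_GB_S = pcie_bandwidth_gb_s(16.0, 16)  # assumed: PCIe 4.0 x16 GPU slot
SSD_READ_GB_S = 7.0                            # assumed: one PCIe 4.0 x4 NVMe SSD, sequential read
NUM_SSDS = 4                                   # from the comment
TOKENS_PER_S = 4.0                             # from the post title

ssd_aggregate_gb_s = NUM_SSDS * SSD_READ_GB_S
link_fill = ssd_aggregate_gb_s / GPU_LINK_GB_S

# Upper bound on data that could cross the GPU link per generated token at 4 tokens/s.
gb_per_token_ceiling = GPU_LINK_GB_S / TOKENS_PER_S

print(f"GPU link:       {GPU_LINK_GB_S:.1f} GB/s")
print(f"4 SSDs:         {ssd_aggregate_gb_s:.1f} GB/s ({link_fill:.0%} of the link)")
print(f"Per-token cap:  {gb_per_token_ceiling:.1f} GB at {TOKENS_PER_S:.0f} tokens/s")
```

Under these assumptions, four drives reading sequentially come close to filling the link, which is consistent with the comment. The estimate covers sequential throughput only; it says nothing about latency or small random reads.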
musicale
Ah Optane, what might have been... Even over PCIe, I imagine the advantage vs. NVMe is lower latency and more operations per second.