Bringing Up DeepSeek-V4-Flash on AMD MI300X
kkm
99 points
11 comments
June 02, 2026
Discussion Highlights (6 comments)
benlm
Nice work! Would DeepSeek V4 Pro on 8xMI300X work with these patches?
kkm
Also, here is the vLLM patch accompanying the blog post: https://github.com/doublewordai/vllm-amd-blog-doubleword
mezark
We at doubleword are bullish on AMD for low-interactivity inference - it just takes a bigger lift on the software side...
maCDzP
I train on AMD MI250X and managed to get Gemma 4 31B to work - but it took a lot of work on the software side.
latchkey
Nice work and thanks for being a customer. (CEO Hot Aisle)
edg5000
Checked out this company about a year ago and they only offered small models. Now I see they have GLM-fp8/Kimi and DeepSeek V4 Pro. Since workloads are predominantly cached input, I'm surprised to see no separate price for cached input vs uncached. I hope the prices will drop significantly; with these prices you'll end up with thousands in monthly costs quickly. Hopefully more hardware companies will be on the market in the coming years. If the Chinese eventually start competing with the current memory makers, maybe that will help.