DeepSeek v4
impact_sy
455 points
183 comments
April 24, 2026
Related Discussions
Found 5 related stories in 61.8ms across 5,406 title embeddings via pgvector HNSW
- DeepSeek-V4 Technical Report [pdf] tianyicui · 19 pts · April 24, 2026 · 84% similar
- DeepSeek by Hand in Excel teleforce · 13 pts · March 18, 2026 · 72% similar
- DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence cmrdporcupine · 146 pts · April 24, 2026 · 71% similar
- Reducto releases Deep Extract raunakchowdhuri · 46 pts · April 06, 2026 · 48% similar
- SereneDB's C++ search engine is the fastest on search benchmarks gnusi · 31 pts · March 19, 2026 · 43% similar
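The "related stories" lookup above is a nearest-neighbor search over title embeddings. As an illustration only (toy 3-d vectors and made-up values, not the site's actual pipeline), the sketch below does an exhaustive cosine-similarity scan; an HNSW index, as in pgvector, approximates exactly this ranking but in sub-linear time.

```python
import math

# Toy corpus of title embeddings. Real title embeddings would have
# hundreds of dimensions; these 3-d vectors are made up for illustration.
corpus = {
    "DeepSeek-V4 Technical Report [pdf]": [0.9, 0.1, 0.0],
    "SereneDB's C++ search engine is the fastest on search benchmarks": [0.1, 0.8, 0.3],
}
query = [1.0, 0.0, 0.1]  # embedding of the submitted title

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Exhaustive scan: the exact ranking an HNSW index approximates.
ranked = sorted(corpus, key=lambda t: cosine(query, corpus[t]), reverse=True)
```

With HNSW, each query walks a layered proximity graph instead of scoring every row, which is how 5,406 embeddings can be searched in tens of milliseconds.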
Discussion Highlights (20 comments)
luyu_wu
For those who didn't check the page yet, it just links to the API docs being updated with the upcoming models, not the actual model release.
seanobannon
Weights available here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
nthypes
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main... Model was released and it's amazing. Frontier level (better than Opus 4.6) at a fraction of the cost.
taosx
Merge? https://news.ycombinator.com/item?id=47885014
gbnwl
I’m deeply interested and invested in the field but I could really use a support group for people burnt out from trying to keep up with everything. I feel like we’ve already long since passed the point where we need AI to help us keep up with advancements in AI.
jdeng
Excited that the long-awaited v4 is finally out. But I'm sad that it's not natively multimodal.
fblp
There's something heartwarming about the developer docs being released before the flashy press release.
Aliabid94
MMLU-Pro: Gemini-3.1-Pro at 91.0, Opus-4.6 at 89.1, and GPT-5.4, Kimi2.6, and DS-V4-Pro tied at 87.5. Pretty impressive.
KaoruAoiShiho
SOTA on MRCR (or it would have been a few hours earlier... it's beaten by 5.5). I've long thought of this as the most important non-agentic benchmark, so this is especially impressive. It beats Opus 4.7 here.
shafiemoji
I hope the update is an improvement. Losing 3.2 would be a real loss; it's excellent.
rvz
The paper is here: [0]. I was expecting the release this month [1], since everyone forgot about it and wasn't reading the papers they were releasing, and seven days later here we have it.

One of the key points of this model is the optimization DeepSeek made to the residual design of the network architecture: manifold-constrained hyper-connections (mHC), from this paper [2], which make it possible to train the model efficiently, especially with the hybrid attention mechanism designed for it. There wasn't much discussion here some months ago [3], but the paper is a recommended read.

I wouldn't trust the benchmarks directly; better to wait for others to try it for themselves and see whether it matches the performance of frontier models. Either way, this is why Anthropic wants to ban open-weight models, and I can't wait for the quantized versions to release momentarily.

[0] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main... [1] https://news.ycombinator.com/item?id=47793880 [2] https://arxiv.org/abs/2512.24880 [3] https://news.ycombinator.com/item?id=46452172
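The contrast between a standard residual stream and hyper-connections can be sketched roughly as follows. This is a loose toy, not the mHC formulation from the paper: it keeps several residual streams in parallel and mixes them with fixed placeholder matrices (`alpha`, `beta`), where the real method learns those weights and adds manifold constraints this sketch does not model.

```python
import math

d = 4  # toy hidden size

def sublayer(x):
    # stand-in for an attention/MLP block
    return [math.tanh(v) for v in x]

def standard_residual(h):
    # classic transformer residual: h' = h + f(h)
    return [a + b for a, b in zip(h, sublayer(h))]

def hyper_connections(streams, alpha, beta):
    # streams: n parallel residual streams, each a d-vector.
    # beta aggregates the streams into the sublayer input; alpha mixes
    # the streams; the sublayer output is broadcast back via beta.
    # (alpha/beta are learned in the real method; fixed here.)
    n = len(streams)
    x = [sum(beta[i] * streams[i][j] for i in range(n)) for j in range(d)]
    y = sublayer(x)
    mixed = [[sum(alpha[i][k] * streams[k][j] for k in range(n)) for j in range(d)]
             for i in range(n)]
    return [[mixed[i][j] + beta[i] * y[j] for j in range(d)] for i in range(n)]

streams = [[0.1] * d, [0.1] * d]
alpha = [[1.0, 0.0], [0.0, 1.0]]  # identity mixing for the toy
beta = [0.5, 0.5]
out = hyper_connections(streams, alpha, beta)
```

The point of the generalization is that n streams plus learned mixing recover the standard residual as a special case (n = 1, alpha = beta = 1) while giving the optimizer more routing freedom.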
jessepcc
At this point 'frontier model release' is a monthly cadence: Kimi 2.6, Claude 4.6, GPT 5.5. The interesting question is which evals will still be meaningful in 6 months.
swrrt
Is there any visualised benchmark/scoreboard comparing the latest models? DeepSeek v4 and GPT-5.5 seem to be groundbreaking.
raincole
History doesn't always repeat itself. But if it does, then in the following week we'll see DeepSeek v4 flood every AI-related online space: thousands of posts swearing it's better than the latest models from OpenAI/Anthropic/Google but costs only pennies. Then a few weeks later it'll be forgotten by most.
ls612
How long does it usually take for folks to make smaller distills of these models? I really want to see how this will do when brought down to a size that will run on a MacBook.
zargon
The Flash version is 284B A13B in mixed FP8/FP4, and the full native-precision weights total approximately 154 GB. The KV cache is said to take 10% as much space as V3's. This looks very accessible for people running "large" local models. It's a nice follow-up to the Gemma 4 and Qwen3.5 small local models.
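A quick back-of-envelope check of these figures (assuming GB means 10^9 bytes here): 154 GB for 284B parameters implies an average of roughly 4.3 bits per parameter, which is consistent with weights stored as a mix of FP8 and FP4.

```python
params = 284e9          # total parameters (284B, per the comment)
weight_bytes = 154e9    # reported weight size, taking 1 GB = 1e9 bytes

# average storage per parameter; a pure-FP8 model would be 8 bits,
# pure FP4 would be 4, so ~4.3 suggests mostly-FP4 with some FP8
bits_per_param = weight_bytes * 8 / params
print(round(bits_per_param, 2))
```

The same arithmetic pure-FP8 would give 284 GB, so the 154 GB figure only makes sense for the mixed-precision weights.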
frozenseven
Better link: https://news.ycombinator.com/item?id=47885014 https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
reenorap
Which version fits in a Mac Studio M3 Ultra 512 GB?
sidcool
Truly open source coming from China. This is heartwarming. I know of the potential ulterior motives.
namegulf
Is there a Quantized version of this?