Kimi K2.6: Advancing open-source coding
meetpateltech
628 points
331 comments
April 20, 2026
Related Discussions
- Kimi K2.6: Advancing Open-Source Coding nekofneko · 39 pts · April 20, 2026 · 93% similar
- Kimi K2.6-code-preview is now available jrop · 12 pts · April 13, 2026 · 81% similar
- Kimi K2.6 kbumsik · 11 pts · April 20, 2026 · 70% similar
- Cursor Composer 2 is just Kimi K2.5 with RL mirzap · 260 pts · March 20, 2026 · 57% similar
- Kiro CLI 2.0 salkahfi · 13 pts · April 13, 2026 · 54% similar
Discussion Highlights (20 comments)
irthomasthomas
Beats Opus 4.6! They missed claiming the frontier by a few days.
nickandbro
Wow, if the benchmarks check out with the vibes, this could almost be like a DeepSeek moment, with Chinese AI now being neck and neck with SOTA US-lab-made models.
swingboy
Exciting benchmarks if true. What kind of hardware do they typically run these benchmarks on? Apologies if my terminology is off, but I assume they're using an unquantized version that wouldn't run on even the beefiest MacBook?
esafak
K2.5 was already pretty decent so I would try this. Starting at $15/month: https://www.kimi.com/membership/pricing edit: Note that you can run it yourself with sufficient resources (e.g., companies), or access it from other providers too: https://openrouter.ai/moonshotai/kimi-k2.6/providers
lbreakjai
I have a subscription through work and I've been trialing it. So far it looks on par with, if not better than, Opus.
verdverm
https://huggingface.co/moonshotai/Kimi-K2.6 Is this the same model? Unsloth quants: https://huggingface.co/unsloth/Kimi-K2.6-GGUF (work in progress, no gguf files yet, header message saying as much)
pt9567
Wow - $0.95 input/$4 output. If it's anywhere near Opus 4.6, that's incredible.
greenavocado
I pray the benchmark figures are true so I can stop paying Anthropic after they screwed me over this quarter by dumbing down their models, making usage quotas ridiculously small, and demanding KYC paperwork.
elfbargpt
I've always been surprised Kimi doesn't get more attention than it does. It's always stood out to me in terms of creativity, quality... has been my favorite model for a while (but I'm far from an authority)
game_the0ry
There is some humor in the fact that China (of all countries) is pioneering possibly the world's most important tech via open source, while we (the US) are doing the exact opposite.
nisegami
The choice of example task for Long-Horizon Coding is a bit spooky if you squint, since it's nearing the territory of LLMs improving themselves.
Banditoz
If the benchmarks are private, how do we reproduce the results? I looked up Humanity's Last Exam (https://agi.safe.ai/), which this model uses, and I can't seem to access it.
mariopt
Really excited to try this one. I've been using Kimi 2.5 for design and it's really good, but borderline useless on backend/advanced tasks. I also discovered that using OpenCode instead of the Kimi CLI really hurts the model's performance (2.5).
oliver236
Isn't this better than Qwen?
simonw
Accessed via OpenRouter, this one decided to wrap the SVG pelican in HTML with controls for the animation speed: https://gisthost.github.io/?ecaad98efe0f747e27bc0e0ebc669e94... Transcript and HTML here: https://gist.github.com/simonw/ecaad98efe0f747e27bc0e0ebc669...
dmix
I'm pretty sure Kimi is what Cursor uses for their "Composer 2" model. Works pretty well as a fallback when Claude runs out, but it's definitely a downgrade.
cassianoleal
If only their API wasn't tied to a Google or phone login...
cmrdporcupine
Running it through opencode to their API and... it definitely seems like it's "overthinking" -- watching the thought process, it's been going for pages and pages and pages diagnosing and "thinking" things through... without doing anything. Sitting at 50k+ output tokens used now just going in thought circles, complete analysis paralysis. Might be a configuration or prompt issue. I guess I'll wait and see, but I can't get use out of this now.
m4rkuskk
I have been testing it in my app all morning, and the results line up with 4.6 Sonnet. This is just a "vibe" feeling with no real testing. I'm glad we have some real competition to the "frontier" models.
XCSme
(commented on the wrong thread, HN doesn't let me delete it :( )