VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

timhigins 97 points 31 comments June 23, 2026

Discussion Highlights (7 comments)

aero2146

I tried generating the classic pelican SVG, but it failed horribly, just showing me a rectangle and a black circle...

noperator

Having some success while testing this model out as a replacement for GPT-5 nano in source code security review. Running on RTX 3090 (24 GB VRAM) via vLLM. It's not great on structured output (as noted in the model card) but I'm working around that in my harness.
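
One way a harness might work around unreliable structured output is to pull the first parseable JSON object out of a free-form reply rather than expecting the reply to be pure JSON. The following is a minimal sketch under that assumption; `extract_first_json` is a hypothetical helper for illustration, not noperator's actual code, and it uses only the Python standard library.

```python
import json


def extract_first_json(text):
    """Return the first JSON object embedded in free-form text, or None.

    Hypothetical helper: scans for each '{' and tries to decode a JSON
    object starting there, skipping positions that do not parse.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at index `pos`
            # and ignores any trailing text after it.
            obj, _end = decoder.raw_decode(text, pos)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON at this brace; try the next one
        pos = text.find("{", pos + 1)
    return None


reply = 'Sure! Here is the finding:\n{"severity": "high", "line": 42}\nHope that helps.'
finding = extract_first_json(reply)
```

A caller could retry the request when the helper returns `None`, which keeps the parsing tolerant of chatty preambles without depending on any model-specific output mode.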

gslepak

Note that these are Python-only results; the model will not do as well with other languages. I'm glad to see more domain-focused SLMs; we need more of them! A programming-focused MoE should work well across many languages.

deftio

There is some base level of intelligence any model needs to be useful, even in narrow tasks. Could you teach a 5 year old to drive a car? A 10 year old? A 12 year old? To drive a car requires being able to read, to have judgement about ice or rainy conditions, to anticipate a child running after a ball. By the time a human is in their mid teens they have acquired the base knowledge... Small models need to have enough base knowledge to be able to be good enough -- even in a seemingly narrow regime. Where is that? Obviously they don't need all the obscure knowledge of a frontier model but there is some base level which is probably more than it would first seem.

SwellJoe

It's terrible at hunting security bugs (I expected it to be, but I wanted to be sure). I added it to a benchmark I made with a corpus of some Mythos-discovered bugs, and it found zero. The smallest pretty successful models remain Qwen 3.6 and Gemma 4 (but I haven't tested the very small variants of those yet). https://swelljoe.com/post/will-it-mythos/

secretslol

Am I right in thinking this is a tiny model which has been trained well to reason, and that's it? Makes me think of a smart person who doesn't know anything about a given topic, but with the right tools will go and research the heck out of it. I really like the sound of this... why have models train on learning anything when you can just train them how to learn and let them get on with it from something as small as a Pi Zero and an internet connection.

NotSuspicious

The interesting thing about models this small is they should be able to be put on a single Taalas chip (the HC1 already runs a Llama 3.1 8B model). We're already at the point where half-decent reasoning could be run on an ASIC (and at mind-boggling speeds).
