Mistral Small 4
pember
56 points
5 comments
March 16, 2026
Related Discussions
Found 5 related stories in 47.7ms across 3,471 title embeddings via pgvector HNSW
- Mistral AI Releases Forge pember · 252 pts · March 17, 2026 · 61% similar
- Mistral secures $830M in debt financing to fund AI data center gmays · 31 pts · April 02, 2026 · 58% similar
- Speaking of Voxtral Palmik · 18 pts · March 26, 2026 · 53% similar
- Mamba-3 WarmWash · 41 pts · March 18, 2026 · 52% similar
- GPT‑5.4 Mini and Nano meetpateltech · 217 pts · March 17, 2026 · 49% similar
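The "% similar" scores above read like cosine similarity over title embeddings, which is what pgvector's HNSW index approximates at query time. A minimal pure-Python sketch of that ranking (the titles are taken from the list above, but the 3-d vectors are toy stand-ins; real title embeddings have hundreds of dimensions, and pgvector's `<=>` operator returns cosine *distance*, i.e. 1 − similarity):

```python
import math

def cosine_similarity(a, b):
    # similarity = dot(a, b) / (|a| * |b|); pgvector's <=> operator is 1 minus this
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, corpus, k=5):
    # brute-force exact ranking; an HNSW index approximates this
    # nearest-neighbour search in sublinear time over thousands of rows
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [(title, round(100 * cosine_similarity(query_vec, vec)))
            for title, vec in ranked[:k]]

# toy 3-d "embeddings" keyed by story title
corpus = {
    "Mistral AI Releases Forge": [0.9, 0.3, 0.1],
    "Mamba-3": [0.2, 0.8, 0.4],
    "GPT-5.4 Mini and Nano": [0.5, 0.5, 0.5],
}
print(top_k([1.0, 0.2, 0.0], corpus, k=2))
```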
Discussion Highlights (5 comments)
2001zhaozhao
Which Haiku model are they comparing to? Is it 4.5? If so, it's absolutely wild that Qwen3.5 122B is shredding it in those graphs.

adt
https://lifearchitect.ai/models-table/
kristianp
Interesting that they target around 120 billion parameters: just enough to fit onto a single H100 with 4-bit quant, or a 128 GB APU like Apple silicon, AMD AI CPUs, or the GB Spark. Copying GPT-OSS-120B? Available to try at https://build.nvidia.com/mistralai/mistral-small-4-119b-2603
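The back-of-envelope check behind the single-H100 claim: weights alone for a ~119B-parameter model at 4-bit quantization come to roughly 60 GB, which clears an 80 GB H100 with room left for KV cache and activations, while bf16 would not. A quick sketch (the helper name is mine, not from any library):

```python
def weight_footprint_gb(n_params, bits_per_weight):
    # raw weight storage only; KV cache and activations add overhead on top
    return n_params * bits_per_weight / 8 / 1e9

# ~119B parameters, per the model name in the comment above
q4 = weight_footprint_gb(119e9, 4)    # 4-bit quantized
bf16 = weight_footprint_gb(119e9, 16)  # full bf16 precision
print(f"4-bit: {q4:.1f} GB, bf16: {bf16:.1f} GB")
# 4-bit fits in an 80 GB H100; bf16 does not
```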
zacksiri
I tested the model in an agentic workflow. Here is the report: https://upmaru.com/llm-tests/simple-tama-agentic-workflow-q1...
revolvingthrow
I really wish the benchmarks were even slightly trustworthy for AI models. ~120B is the largest size I can run locally, so naturally I grabbed the 122B Qwen3.5, which had great benchmarks, and… frankly, the model is garbage, worse than GLM 4.5 Air IMO. But then, Qwen famously benchmaxxes.

And here we have another release whose benchmarks are just a tiny bit worse than Qwen3.5's (for far fewer tokens). Am I to take it that the model is worse? Or does Qwen's benchmaxxing mean that a slightly worse result from a non-Qwen model actually indicates a better model? I'd rather not spend hours testing things myself for every noteworthy release.

Ah well. Mistral has been fairly decent, so it's worth taking a look. Obviously they're behind the big three, but in my experience their small models are probably the best you can get for several months after each release. I'm not sure how it works as a sales funnel for their paid models (same as with the Chinese models; people likely just go for Google/OpenAI/Anthropic in this case), but I'm thankful for their existence.