GPT‑5.4 Mini and Nano
meetpateltech
217 points
134 comments
March 17, 2026
Related Discussions
Found 5 related stories in 53.8ms across 3,471 title embeddings via pgvector HNSW
- GPT-5.4 mudkipdev · 739 pts · March 05, 2026 · 82% similar
- GPT-5.4 meetpateltech · 156 pts · March 05, 2026 · 82% similar
- GPT-5.4 Thinking and GPT-5.4 Pro denysvitali · 92 pts · March 05, 2026 · 71% similar
- GPT‑5.3 Instant meetpateltech · 319 pts · March 03, 2026 · 70% similar
- GPT 5.4 Thinking and Pro twtw99 · 64 pts · March 05, 2026 · 68% similar
Discussion Highlights (20 comments)
machinecontrol
What's the practical advantage of using a mini or nano model versus the standard GPT model?
powera
I've been waiting for this update. For many "simple" LLM tasks, GPT-5-mini was sufficient 99% of the time. Hopefully these models can do even more and get closer to 100% accuracy. The prices are up 2-4x compared to GPT-5-mini and -nano. Were those models just loss leaders, or are these substantially larger/better?
HugoDias
According to their benchmarks, GPT-5.4 Nano > GPT-5-mini in most areas, but I'm noticing models are getting more expensive, not cheaper:
- GPT-5 mini: Input $0.25 / Output $2.00
- GPT-5 nano: Input $0.05 / Output $0.40
- GPT-5.4 mini: Input $0.75 / Output $4.50
- GPT-5.4 nano: Input $0.20 / Output $1.25
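The generation-over-generation increase HugoDias lists can be checked with a few lines of arithmetic. A minimal sketch, using only the per-million-token prices quoted in the thread (model keys here are informal labels, not API identifiers):

```python
# Price-per-million-token figures quoted in the thread (USD).
prices = {
    "gpt-5-mini":   {"input": 0.25, "output": 2.00},
    "gpt-5-nano":   {"input": 0.05, "output": 0.40},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def price_multiplier(old: str, new: str) -> dict:
    """Ratio of new-model price to old-model price, per token type."""
    return {k: prices[new][k] / prices[old][k] for k in ("input", "output")}

mini = price_multiplier("gpt-5-mini", "gpt-5.4-mini")  # input 3.0x, output 2.25x
nano = price_multiplier("gpt-5-nano", "gpt-5.4-nano")  # input 4.0x, output 3.125x
```

This matches powera's estimate above: the 5.4 small models cost roughly 2-4x their GPT-5 counterparts.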
ryao
I will be impressed when they release the weights for these and older models as open source. Until then, this is not that interesting.
simianwords
why isn't nano available in codex? could be used for ingesting huge amounts of logs and other such things
BoumTAC
To me, mini releases matter much more and better reflect the real progress than SOTA models. The frontier models have become so good that it's getting almost impossible to notice meaningful differences between them. Meanwhile, when a smaller / less powerful model releases a new version, the jump in quality is often massive, to the point where we can now use them 100% of the time in many cases. And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.
cbg0
Based on SWE-bench, it seems like 5.4 mini high is ~= GPT-5.4 low in terms of accuracy and price, but the latency for mini is considerably higher: 254 seconds vs 171 seconds for GPT-5.4. Probably a good option to run at lower effort levels to keep costs down for simpler tasks. Long context performance is also not great.
beklein
As a big Codex user, with many smaller requests, this one is the highlight: "In Codex, GPT‑5.4 mini is available across the Codex app, CLI, IDE extension and web. It uses only 30% of the GPT‑5.4 quota, letting developers quickly handle simpler coding tasks in Codex for about one-third the cost." + Subagents support will be huge.
miltonlost
Does it still help drive people to psychosis and murder and suicide? Where's the benchmark for that?
system2
I am feeling the version fatigue. I cannot deal with their incremental bs versions.
yomismoaqui
Not comparing with equivalent models from Anthropic or Google, interesting...
casey2
I googled all the testimonial names and they are all LinkedIn mouthpieces.
Tiberium
I checked the current speed over the API, and so far I'm very impressed. Of course models are usually not as loaded on release day, but right now:
- Older GPT-5 Mini: about 55-60 tokens/s on the API normally, 115-120 t/s with service_tier="priority" (2x cost).
- GPT-5.4 Mini: averages about 180-190 t/s on the API. Priority currently does nothing for it.
- GPT-5.4 Nano: about 200 t/s.
To put this into perspective, Gemini 3 Flash is about 130 t/s on the Gemini API and about 120 t/s on Vertex. This is raw tokens/s for all models; it doesn't exclude reasoning tokens, but I ran models with none/minimal effort where supported.
And quick price comparisons (input/output per million tokens):
- Claude: Opus 4.6 is $5/$25, Sonnet 4.6 is $3/$15, Haiku 4.5 is $1/$5
- GPT: 5.4 is $2.50/$15 ($5/$22.50 for >200K context), 5.4 Mini is $0.75/$4.50, 5.4 Nano is $0.20/$1.25
- Gemini: 3.1 Pro is $2/$12 ($3/$18 for >200K context), 3 Flash is $0.50/$3, 3.1 Flash Lite is $0.25/$1.50
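Tiberium's methodology (raw tokens/s, reasoning tokens included) can be reproduced with a small timing harness. A minimal sketch: `measure_tps` consumes any token stream and reports throughput; in real use the iterator would be a streaming API response, and `tokens_per_chunk` is a simplifying assumption that each chunk carries one token.

```python
import time
from typing import Iterable

def measure_tps(chunks: Iterable[str], tokens_per_chunk: int = 1) -> float:
    """Consume a token stream and return raw tokens/s.

    Matches the comment's method: no distinction between reasoning
    and output tokens, just total tokens over wall-clock time.
    """
    start = time.perf_counter()
    n = 0
    for _ in chunks:
        n += tokens_per_chunk
    elapsed = time.perf_counter() - start
    return n / elapsed if elapsed > 0 else float("inf")
```

To get comparable numbers across providers you would feed each model the same prompt and run at the lowest reasoning effort the API supports, as the comment describes.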
6thbit
Looking at the long context benchmark results for these, sounds like they are best fit for also mini-sized context windows. Is there any harness with an easy way to pick a model for a subagent based on the required context size the subagent may need?
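One way such a harness could work is a tiered lookup: route each subagent to the cheapest model whose usable context budget covers its estimated input. A minimal sketch; the model names are from this thread, but the token budgets and the headroom factor are illustrative assumptions, not published limits:

```python
# Cheapest-first tiers: (model, assumed usable context budget in tokens).
# Budgets are deliberately conservative, reflecting the weak long-context
# benchmark results mentioned above, not the models' advertised maximums.
TIERS = [
    ("gpt-5.4-nano", 32_000),
    ("gpt-5.4-mini", 128_000),
    ("gpt-5.4",      400_000),
]

def pick_model(estimated_tokens: int, headroom: float = 1.2) -> str:
    """Pick the smallest tier whose budget covers the estimate plus headroom."""
    need = int(estimated_tokens * headroom)
    for name, budget in TIERS:
        if need <= budget:
            return name
    return TIERS[-1][0]  # fall back to the largest model
```

For example, a log-summarizing subagent estimated at 10K tokens would land on nano, while a 100K-token code-review subagent would be routed to mini.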
bananamogul
They could call them something like "sonnet" and "haiku" maybe.
reconnecting
All three ChatGPT models (Instant, Thinking, and Pro) have a new knowledge cutoff of August 2025. Seriously?
varispeed
I stopped paying attention to GPT-5.x releases, they seem to have been severely dumbed down.
pscanf
I quite like the GPT models when chatting with them (in fact, they're probably my favorites), but for agentic work I've only had bad experiences with them. They're incredibly slow (via the official API or OpenRouter), but most of all they seem not to understand the instructions I give them. I'm sure I'm _holding them wrong_, in the sense that I'm not tailoring my prompt for them, but most other models don't have problems with the exact same prompt. Does anybody else have a similar experience?
dack
i want 5.4 nano to decide whether my prompt needs 5.4 xhigh and route to it automatically
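dack's idea is essentially a two-stage router: a cheap model triages the prompt, and only hard prompts get escalated to the expensive high-effort model. A minimal sketch of the routing logic; `classify` here is a keyword stub standing in for a real nano call, and the model/effort names are taken from this thread:

```python
def classify(prompt: str) -> str:
    """Stub triage step. A real version would ask gpt-5.4-nano
    to label the prompt 'easy' or 'hard' instead of keyword-matching."""
    hard_markers = ("prove", "refactor", "debug", "optimize")
    return "hard" if any(w in prompt.lower() for w in hard_markers) else "easy"

def route(prompt: str) -> tuple[str, str]:
    """Return (model, reasoning_effort) for a prompt."""
    if classify(prompt) == "hard":
        return ("gpt-5.4", "xhigh")
    return ("gpt-5.4-nano", "minimal")
```

The economics only work if the triage call is much cheaper than the average saving from not running the big model, which is why a nano-priced classifier is the natural fit.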
kseniamorph
wow, not a bad result on the computer-use benchmark for the mini model. for example, Claude Sonnet 4.6 shows 72.5%, almost on par with GPT-5.4 mini (72.1%), but Sonnet costs 4x more on input and 3x more on output