Why are large language models so terrible at video games?
sxx0
29 points
54 comments
June 01, 2026
Related Discussions
- Top AI models underperform in languages other than English Brajeshwar · 19 pts · March 19, 2026 · 59% similar
- Artificial intelligence-associated delusions and large language models beardyw · 12 pts · March 14, 2026 · 57% similar
- Hallucination Is Inevitable: An Innate Limitation of Large Language Models drob518 · 12 pts · May 04, 2026 · 55% similar
- Different language models learn similar number representations Anon84 · 94 pts · April 24, 2026 · 53% similar
- Language model teams as distributed systems jryio · 87 pts · March 16, 2026 · 51% similar
Discussion Highlights (19 comments)
danaris
> This brings us to what seems like a contradiction. LLMs are bad at playing games. Yet at the same time, they’re improving rapidly at coding, a skill set that can be used to create a game. How do these facts fit together?
> Togelius: It’s super weird.
...No, it's really not. They're language models. Code is a language. "Playing a game well" is not. One can, hypothetically, encode game inputs in such a way that it seems kinda-sorta like a language, but it has none of the same kinds of structures that languages (both human and programming) do. The only way one can think this is strange is if one thinks of LLMs' ability to code rudimentary games as being due to a deeper understanding of games, rather than due to game code being well-represented in their training data.
jiehong
Video games are made to entertain humans, so does it really matter whether LLMs are good at playing them?
cultofmetatron
cough JEPA cough
voidUpdate
It's almost like the Large Language Model has trouble with things that aren't Language, such as realtime controller input and video output from a game
panarchy
I actually really miss all the research being done on having (reinforcement learning) AIs beat Atari games and the like. Or the one that stopped at a TV playing random images instead of continuing through the level. Has there been any progress in that field? It seems like LLMs came around and all the projects stopped completely.
nottorp
There was good progress in training neural networks to play video games. Unfortunately it doesn't seem to fit in some people's context because it was a few years ago. Kind reminder: there is "AI" beyond LLMs.
andunie
I wonder if they would be good at text-based games.
ThunderSizzle
I wonder if you paired a few different types of AI together, an LLM agent might be good at strategizing, e.g. building a strategy for how to handle a scenario. But it would basically need to know the entire game manual. Then it would pass the strategy to a better AI in some way. But it might not be needed if the better gaming AI can just do that part too already. I admit I know nothing about this though.
ceheaaf
It feels like they're really focusing on overstating how confusing and weird it is that an LLM can write code but not play games very well, rather than just explaining it. Code is text. LLMs are text input/output machines. Game input/output is not at all text. LLMs can certainly reason about games with a simple/explicit enough domain (try a Risk tournament where models can talk to each other between turns!)
dsabanin
Why is a language model bad at video games? I think the answer is stated in the question itself.
jagged-chisel
Because they’re large language models. Language doesn’t map onto gameplay. Choose another “AI” technology and give another go.
Zobat
As others have hinted, LLMs aren't really made in a way that makes them likely to play video games (CS/Halo and such) well. I wonder how they'd fare "against" text-based adventures like Zork (which they'll no doubt have ample knowledge about) and newer text-based adventure games (which they'll know less about).
deyiao
I guess the author’s point is that LLMs can’t really learn in real time yet, whereas playing games is basically all about real-time learning. So an LLM can be very good at writing code, but still be terrible at actually playing games. Personally, I think this is a really hard problem, and it may turn out to be one of the first big walls we hit on the road to AGI.
meffmadd
I found LLMs to be surprisingly good at puzzle games like Baba Is You: https://meffmadd.github.io/samplesurium/posts/baba_is_agent/
suyavuz
The coding comparison is more interesting to me. Programming has unusually good feedback loops. A test fails, an exception gets thrown, a benchmark regresses. Most games don't give you that kind of signal. I wonder how much of current coding performance depends on that.
pmontra
I don't know what to save from this article. Maybe only "[LLMs are] very bad at spatial reasoning. Which shouldn’t be surprising, because that’s also not in the training data."
nickcageinacage
Maybe LLMs should stay away from the arts
aabdi
Isn’t this more of a “we didn’t RL the model to do games, so it can’t do it” situation? Something like Snake or tic-tac-toe is straightforward.
josefritzishere
LLMs are terrible at a lot of things, and mediocre at most things. What those things have in common with each other is interesting though.