Everything in C is undefined behavior
lycopodiopsida
481 points
630 comments
May 20, 2026
Related Discussions
- No way to parse integers in C (2022) konmok · 74 pts · May 20, 2026 · 48% similar
- Unsigned sizes: A five year mistake lerno · 77 pts · May 02, 2026 · 47% similar
- Why does C have the best file API maurycyz · 95 pts · March 01, 2026 · 45% similar
- The Little Book of C ghostrss · 65 pts · March 26, 2026 · 44% similar
- While the King Lives: An Old C Programming Prank in GNU Hello from 1993 bananamogul · 24 pts · May 05, 2026 · 43% similar
Discussion Highlights (19 comments)
dmitrygr
I stopped reading about here:
> bool parse_packet(const uint8_t* bytes) {
> const int* magic_intp = (const int*)bytes; // UB!
Author, if you are reading this, please cite the spec section explaining that this is UB. Dereferencing the produced pointer may be UB, but the cast itself is not, since uint8_t is ~ char and char* can be cast to and from any type. You might try to argue that uint8_t is not necessarily char, and while it is true that implementations of C can exist where CHAR_BIT > 8, those do not have uint8_t defined (as per spec), so if you have uint8_t, then it is "unsigned char", which makes this cast perfectly safe and defined as far as I can tell. Of course CHAR_BIT is required to be >= 8, so if it is not > 8, it is exactly 8. (In any case, whether uint8_t is literally a typedef of unsigned char is implementation-defined and not actually relevant to whether the cast itself is valid -- it is)
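Whichever way the argument about the cast goes (the standard's pointer-conversion rules also mention alignment), the dereference can be avoided entirely by copying the bytes out with memcpy. A minimal sketch of that approach follows. The `len` parameter, the `load_u32` helper, and the `PACKET_MAGIC` value are assumptions for illustration and are not from the article, and the comparison uses the host's byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value, chosen only for illustration. */
#define PACKET_MAGIC 0x12345678u

/* Copies four bytes into a properly typed object instead of
   reading them through a cast pointer, so alignment and
   aliasing never come into play. */
static uint32_t load_u32(const uint8_t *bytes)
{
    uint32_t value;
    assert(bytes != NULL);
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Assumed signature: the quoted snippet takes no length, which
   leaves the read size unchecked. The value is compared in the
   host's byte order. */
bool parse_packet(const uint8_t *bytes, size_t len)
{
    if (bytes == NULL || len < sizeof(uint32_t)) {
        return false;
    }
    return load_u32(bytes) == PACKET_MAGIC;
}
```

Mainstream compilers typically turn the fixed-size memcpy into a single load, so this form usually costs nothing at run time.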
weinzierl
A fun one that'd fit the list would be sequence point violations like i = i++
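For readers who have not run into this one: an expression such as `out[i] = i++` both reads and modifies `i` with no ordering between the two, which C leaves undefined. A small sketch of the usual fix, splitting the work into separate statements, is below. The function name `fill_sequence` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Writes 0, 1, ..., n-1 into out. */
void fill_sequence(int *out, int n)
{
    int i = 0;
    assert(out != NULL || n <= 0);
    while (i < n) {
        /* out[i] = i++;   <- unsequenced read and write of i: undefined */
        out[i] = i;  /* use the current value of i ...           */
        i++;         /* ... then modify it in its own statement. */
    }
}
```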
veltas
From the ANSI C standard: 3.16 undefined behavior: Behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message). Is it just me or did compiler writers apply overly legalistic interpretation to the "no requirements" part in this paragraph? The intent here is extremely clear, that undefined behavior means you're doing something not intended or specified by the language, but that the consequence of this should be somewhat bounded or as expected for the target machine. This is closer to our old school understanding of UB. By 'bounded', this obviously ignores the security consequences of e.g. buffer overflows, but just because UB can be exploited doesn't mean it's appropriate for e.g. the compiler to exploit it too, that clearly violates the intent of this paragraph.
stackghost
Anyone who uses the construction "C/C++" doesn't write modern C++, and probably isn't very familiar with the recent revisions despite TFA's claims of writing it every day for decades. Far from being just "C with classes", modern C++ is very different than C. The language is huge and complex, for sure, but nobody is forced to use all of it. No HN comment can possibly cover all the use cases of C++ but in general, unless you have a very good reason not to:
- eschewing boomer loops in favor of ranges
- using RAII with smart pointers
- move semantics
- using STL containers instead of raw arrays
- borrowing using spans and string views
These things go a long way towards, shall we say, "safe-ish" code without UB. It is not memory-safe enforced at the language level, like Rust, but the upshot is you never need to deal with the Rust community :^)
bestouff
The problem of UB is not really that it may crash in some architecture. The real problem is that the compiler expects UB code to NOT happen, so if you write UB code anyway the compiler (and especially the optimizer) is allowed to translate that to anything that's convenient for its happy path. And sometimes that "anything" can be really unexpected (like removing big chunks of code).
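A commonly cited illustration of this is an overflow check that itself relies on overflow. The sketch below keeps the risky form in a comment only and shows one way to write the check so that no overflow can occur. The function names are invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Risky form, shown only as a comment:
 *
 *     if (x + 1 < x) { handle overflow }
 *
 * Signed overflow is undefined, so an optimizer may assume
 * x + 1 never wraps, conclude the condition is always false,
 * and delete the whole branch. */

/* Defined form: test against the limit before doing the arithmetic. */
bool increment_would_overflow(int x)
{
    return x == INT_MAX;
}

/* Stores x + 1 in *out and returns true, or returns false and
   leaves *out untouched when the increment would overflow. */
bool checked_increment(int x, int *out)
{
    assert(out != NULL);
    if (increment_would_overflow(x)) {
        return false;
    }
    *out = x + 1;
    return true;
}
```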
raluk
In C / C++ there are two kinds of undefined behaviour. One is where the standard explicitly says the behaviour is undefined. The other is everything else that the standard does not cover.
__0x01
> A problem with this is that in order to confirm the findings, you’ll need an expert human. But generally expert humans are busy doing other things. The article suggests using LLMs to identify and fix UB. However as per the above, I think the issue is that we need more expert humans. LLM generated code will eventually contain UB. EDIT: added "eventually"
jraph
Yet another push to use LLMs after casting fear. Now it should be illegal not to use LLMs. A good start of the day. (I hope casting fear is not UB)
momo26
Debugging in C is soooo hard. When I was writing Malloc Lab in my systems course, there were countless undefined-behavior and out-of-range errors :(
logicchains
The concept of undefined behaviour is also a very useful lens for understanding LLM-based coding. Anything you don't explicitly specify is undefined behavior, so if you don't want the LLM to potentially pick a ridiculous implementation for some aspect of an application, make sure to explicitly specify how it should be implemented.
cracki
We know. This is not news.
liamd1988
When you use C, keep using char* and don't mess with int*
fithisux
UB can also have an impact on the logical cohesion of a codebase.
my-next-account
Hello, it's me. I'm not afraid of UB.
debugnik
As much as I agree with the intro, these examples aren't good and the overall article is just a veil for pushing LLM coding.
greysphere
The examples aren't really undefined behavior. They are examples that could become UB based on input/circumstances. If you are going to be that generous, every function call is UB because it could exceed stack space, which is basically true in any language (up to the equivalent def of UB in that language). I feel like C has enough actual rough edges that deserve attention that sensationalism like this muddies folks' attention (particularly novices') and can end up doing more harm than good.
quelsolaar
The 5 stages of learning about UB in C:
- Denial: "I know what signed overflow does on my machine."
- Anger: "This compiler is trash! Why doesn't it just do what I say!?"
- Bargaining: "I'm submitting this proposal to wg14 to fix C..."
- Depression: "Can you rely on C code for anything?"
- Acceptance: "Just don't write UB."
rurban
Very bad advice. Of course good new LLMs know about UB, but you still need to use ubsan (i.e. -fsanitize=undefined), and not your LLM.
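As a rough illustration of what ubsan is for, the sketch below pairs a function with a latent signed overflow with a widened variant. When the file is built with GCC or Clang using `-fsanitize=undefined`, calling `average_naive(INT_MAX, 1)` should produce a runtime report of signed integer overflow instead of failing silently. Both function names are invented for this example, and the widened variant assumes `long long` is wider than `int`, which holds on common platforms but is not guaranteed by the standard.

```c
#include <assert.h>
#include <limits.h>

/* Latent UB: a + b overflows for large inputs such as
   (INT_MAX, 1). It is fine for small values, which is why
   tests alone tend to miss it. */
int average_naive(int a, int b)
{
    return (a + b) / 2;
}

/* Widened variant. Assumption: long long is wider than int,
   so the sum cannot overflow; the halved result always fits
   back into an int. */
int average_wide(int a, int b)
{
    long long sum = (long long)a + (long long)b;
    return (int)(sum / 2);
}
```

The sanitizer only reports what actually executes, so it complements tests that exercise boundary inputs and does not replace them.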
VimEscapeArtist
Wait until he discovers PowerShell ;D