Deterministic Fully-Static Whole-Binary Translation Without Heuristics

matt_d 63 points 6 comments May 13, 2026
arxiv.org

Discussion Highlights (2 comments)

dmitrygr

Cute, but Rice's theorem remains. Even though they translated every byte as code, there is still no way to handle char buf[] = {0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3}; return ((int (*)(void))buf)(); Static translation is only possible when you assume no adversarial code AND mostly assume compiler-produced binaries. Hand-rolled asm gets hard, and adversarial code is provably unsolvable in the general case. Still, pretty cool for cooperative binaries.
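To make the example concrete without executing a data buffer (which most modern systems refuse to do), the sketch below decodes those six bytes by hand. On x86, 0xB8 followed by a 32-bit little-endian immediate is mov eax, imm32, and 0xC3 is ret, so the buffer is a tiny function returning 42. The helper name decode_mov_eax_ret is hypothetical and recognizes only this one pattern; it is an illustration, not a disassembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the paper or the thread): recognizes
 * exactly the x86 sequence "mov eax, imm32; ret".
 * Returns 1 and stores the immediate in *value on a match, else 0. */
int decode_mov_eax_ret(const uint8_t *buf, size_t len, uint32_t *value)
{
    /* 0xB8 = mov eax, imm32 (5 bytes total); 0xC3 = ret (1 byte). */
    if (len != 6 || buf[0] != 0xB8 || buf[5] != 0xC3)
        return 0;

    /* The immediate is stored little-endian in bytes 1..4. */
    *value = (uint32_t)buf[1]
           | ((uint32_t)buf[2] << 8)
           | ((uint32_t)buf[3] << 16)
           | ((uint32_t)buf[4] << 24);
    return 1;
}
```

Called on the buffer from the comment, this reports a match with the immediate 42. The point of the comment is that a program can compute such bytes at run time, so no static pass over the binary is guaranteed to see them.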

jonhohle

This is neat. I haven’t looked into it, but I would think relative offsets could still be an issue, although there must be some translation layer/MMU, since the generated code will be a different size anyway. This would primarily affect jump tables and internal branches. I mostly work on stuff from the 90s. Disassemblers make a lot of assumptions about where code starts and ends, and occasionally a binary blob is not discoverable unless you have some prior knowledge (such as a pointer at a fixed location to an entry point). I would think that after a few passes you could refine the binary into areas that are definitely code.
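One generic way to cope with code that changes size is an address map from original instruction addresses to translated ones, consulted whenever a jump-table entry or branch target has to be rewritten. The sketch below assumes such a map exists and is sorted by original address. The thread does not say how the paper handles this, so the types and function names (addr_pair, translate_addr, rewrite_jump_table) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: original address -> translated address. */
typedef struct {
    uint32_t orig;
    uint32_t translated;
} addr_pair;

/* Binary search over a map sorted by .orig.
 * Returns 1 and writes *out if orig is a known instruction start, else 0. */
int translate_addr(const addr_pair *map, size_t n, uint32_t orig, uint32_t *out)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (map[mid].orig == orig) {
            *out = map[mid].translated;
            return 1;
        }
        if (map[mid].orig < orig)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Rewrites jump-table entries in place. Entries with no mapping are left
 * untouched; the return value is how many such entries were found. */
size_t rewrite_jump_table(uint32_t *table, size_t entries,
                          const addr_pair *map, size_t n)
{
    size_t unmapped = 0;
    for (size_t i = 0; i < entries; i++) {
        uint32_t t;
        if (translate_addr(map, n, table[i], &t))
            table[i] = t;
        else
            unmapped++;
    }
    return unmapped;
}
```

A nonzero return value flags targets that land somewhere the translator did not treat as an instruction start, which is the kind of case where prior knowledge about entry points would be needed.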
