A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly

monax 26 points 0 comments May 05, 2026
github.com