A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly

monax · 26 points · 0 comments · May 05, 2026
github.com