Train Your Own LLM from Scratch

kristianpaul 52 points 7 comments May 05, 2026
github.com

Discussion Highlights (4 comments)

iamnotarobotman

This looks great for a first introduction to training LLMs, and it looks simple enough to try this locally. Great job!

jvican

If you're interested in this resource, I highly recommend checking out Stanford's CS336 class. It covers this whole curriculum in a lot more depth and introduces you to a lot of theoretical aspects (scaling laws, intuitions) and systems thinking (kernel optimization/profiling). For this, you have to do the assignments, of course... https://cs336.stanford.edu/

baalimago

Train your LM from scratch*. I doubt you have a machine big enough to make it "Large".

hiroakiaizawa

Nice. What scale does this realistically reach on a single machine?
