Exploring the internal representations of Pangram 3.3.2
krackers
18 points
5 comments
June 25, 2026
Discussion Highlights (3 comments)
Chu4eeno
I wonder whether, given enough material from individual humans, they could have distinguished between those writers as well. It really seems like their model is learning to recognize some general form of a writer's "voice", so to speak (and I assume their final layer just knows which voices are supposed to be tagged as what).
saithound
I use Pangram quite extensively (burning through my 600-token allowance every month). They managed to get their false-positive rate impressively low: if Pangram says something is 100% AI-written, you can trust that. But they need to improve their humanizer dataset. Right now, most models can be given system prompts that cause them to emit text classified as 100% human. It looks like their automated humanizers do worse than these system prompts. Or (alarming if so) they chose not to include the ones that would make their product look unreliable.
jazzpush2
Hoping for a follow-up with Sparse Autoencoders.