How Unicode Collation Works (2025)
theobeers
12 points
4 comments
April 30, 2026
Discussion Highlights (1 comment)
theobeers
Submission statement: Figuring out how to develop a Unicode collator from scratch, for a research group that I was working with in Berlin, was one of my formative experiences as a programmer. Ever since then, I've wanted to write something to collect my thoughts on the Unicode Collation Algorithm and the process of building a conformant implementation. Last summer I had a good excuse to do this, when I decided to adapt my collator to Zig as a way of learning that language. The Unicode standards, and the (relatively) low-level software libraries based on them, do a lot of things for us to make computing possible. We have the luxury of not needing to worry about most of those things most of the time. I find it humbling whenever I do peek under the hood.