Parallel Parentheses Matching
Athas
75 points
11 comments
June 25, 2026
Related Discussions
Found 5 related stories in 118.1ms across 11,625 title embeddings via pgvector HNSW
- Parallel coding agents with tmux and Markdown specs schipperai · 139 pts · March 02, 2026 · 46% similar
- The Training Example Lie Bracket pb1729 · 26 pts · April 09, 2026 · 45% similar
- Data Parallel C++ tosh · 14 pts · May 28, 2026 · 44% similar
- Generalised plusequals leontrolski · 13 pts · April 24, 2026 · 44% similar
- Combinators tosh · 140 pts · March 31, 2026 · 42% similar
Discussion Highlights (6 comments)
pantsforbirds
Fun article and worth the read, but sadly none of the LaTeX was rendered for me (assuming it was supposed to).
raphlinus
Also see Fast GPU bounding boxes on tree-structured scenes[1] (unpublished paper) and notes toward a blog post[2]. This is a highly tuned GPU implementation of parentheses matching. It's actually used in Vello (the classic version in which we offload basically all the work to the GPU, not the newer CPU-GPU hybrid version in which tracking the blend stack is done on the CPU). Earlier versions of the work were featured on HN [3][4], but this is much more sophisticated. (plus a few more zero-comment submissions) The basic idea (bicyclic semigroup and binary search) is the same as the submission. I think earliest attribution is to Bar-On and Vishkin[5] from 1985. Another implementation of this idea is in pareas[6], an experimental GPU-accelerated compiler. I believe this work is publishable and would love to work with a student to resubmit it. Especially if you're a student or prof in Sydney, please reach out. [1]: https://arxiv.org/abs/2205.11659 [2]: https://github.com/raphlinus/raphlinus.github.io/issues/66 [3]: https://news.ycombinator.com/item?id=24385095 [4]: https://news.ycombinator.com/item?id=27164009 [5]: https://dl.acm.org/doi/10.1145/3318.3478 [6]: https://github.com/Snektron/pareas
solomonb
Getting to discover Oleg Kiselyov's work for the first time is such a treat. His web archive is incredible! I'm envious of the author and anyone else discovering it today. https://okmij.org/ftp/
dmkolobov
Nice. I remember being exposed to this idea as Dyck languages , via Ed Kmett’s recorded Monoidal Parsing talk.
ww520
This is an interesting read. You can solve the same problem with Range Min Query Tree. The query for balanced or unbalanced parentheses is O(log N). A two or three level RMQTree can represent billions of parentheses already (a two-level tree of 65536 branch factor = 4B parentheses). The query is O(65536 + 65536) or effectively O(1). For a four-level tree of 256 branch factor, the query is O(256 + 256 + 256 + 256) or O(1). It becomes a problem of memory access on the number of levels vs the number of entries to process per level. A total = 0 and min > 0 mean the parentheses are balanced. The parentheses can be broken up into multiple segments, and be processed in parallel. Multiple RMQTrees can be used, one per segment, and the trees can be processed in parallel. Need to have a final pass to sum up the 'total' and 'min' from all the RMQTrees. Parallel processing probably doesn't gain much.
cscheid
If you stare at the CYK algorithm long enough and see it for the dynamic programming approach it is, you'll then realize that you can do the same parallelization trick for any context-free grammar!