Show HN: Mdarena – Benchmark your Claude.md against your own PRs
hudsongr
15 points
1 comment
April 05, 2026
Related Discussions
Found 5 related stories in 63.1ms across 4,562 title embeddings via pgvector HNSW
- Show HN: Claude's Code – tracking the 19M+ commits generated by Claude on GitHub phantomCupcake · 13 pts · March 24, 2026 · 60% similar
- Show HN: Claudraband – Claude Code for the Power User halfwhey · 103 pts · April 12, 2026 · 56% similar
- Show HN: Continual Learning with .md wenhan_zhou · 23 pts · April 13, 2026 · 56% similar
- Show HN: Claude-replay – A video-like player for Claude Code sessions es617 · 79 pts · March 06, 2026 · 54% similar
- Show HN: A playable version of the Claude Code Terraform destroy incident cdnsteve · 22 pts · March 10, 2026 · 52% similar
Discussion Highlights (1 comment)
hudsongr
Hey! I built this because everyone's writing CLAUDE.md files now, but nobody knows if theirs actually works. The research is contradictory too: one paper says they hurt performance, another says they help. So I made a tool that just measures it on your own repo, using your own PRs and your own test suite. It turns out it's not often you can point to a single markdown file and say "this made the agent 27% better at resolving real tasks," but that's what we saw on our production monorepo. I imagine this as a way for teams to actually make their agents write better code instead of guessing.
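For anyone wondering what "measure it on your own PRs" looks like mechanically, here's a minimal sketch of the kind of A/B loop described above. To be clear, this is my reading of the idea, not Mdarena's actual code: `PRTask`, `attempt`, the ablation-by-deleting-CLAUDE.md step, and the agent invocation via Claude Code's `claude -p` print mode are all hypothetical stand-ins.

```python
"""Hypothetical sketch of a CLAUDE.md A/B benchmark over merged PRs.

Assumes: a local git checkout, a list of merged PRs (base SHA + title),
a test command that passed on the merged PR, and an agent callable via
`claude -p`. A real harness would sandbox the agent and isolate runs.
"""
import subprocess
from dataclasses import dataclass


@dataclass
class PRTask:
    base_sha: str   # commit the PR branched from
    title: str      # reused as the task prompt
    test_cmd: str   # e.g. "pytest -q"; passed on the merged PR


def sh(cmd: str, cwd: str) -> bool:
    """Run a shell command in the repo; True on exit code 0."""
    return subprocess.run(cmd, shell=True, cwd=cwd).returncode == 0


def attempt(task: PRTask, repo: str, use_claude_md: bool) -> bool:
    """Replay one PR from its base commit, with or without CLAUDE.md."""
    sh(f"git checkout -f {task.base_sha} && git clean -fd", repo)
    if not use_claude_md:
        sh("rm -f CLAUDE.md", repo)  # ablate the file for this run
    # Hypothetical agent call; swap in whatever agent you benchmark.
    sh(f'claude -p "Implement: {task.title}"', repo)
    return sh(task.test_cmd, repo)   # did the PR's own tests pass?


def benchmark(tasks: list[PRTask], repo: str) -> None:
    """Compare resolution rates with and without CLAUDE.md present."""
    for with_md in (True, False):
        wins = sum(attempt(t, repo, with_md) for t in tasks)
        label = "with CLAUDE.md" if with_md else "without CLAUDE.md"
        print(f"{label}: {wins}/{len(tasks)} tasks resolved")
```

The key design point is that the ground truth is free: each merged PR already ships the tests that define success, so the benchmark needs no hand-labeled tasks, just your own git history.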