Open Code Review – An AI-powered code review CLI tool

geoffbp 121 points 28 comments June 05, 2026
github.com · View on Hacker News

Discussion Highlights (8 comments)

singingtoday

I'm interested in trying this. We have our own internal automated review which has shown positive results, but I would love to drop it if I find something better. Code review is currently our bottleneck, so any possibility of better automating it is welcome.

atestu

We've been using CodeRabbit: it's a great deal ($30/mo/dev flat) and it finds a lot. I also built a skill I call `/meta-review` that asks Codex, Cursor, and Gemini to review the code (I use Claude Code). It always finds little things Claude and I missed. CodeRabbit just came out with their own PR review UI that's great for big PRs; it groups files together, etc. https://www.coderabbit.ai/blog/introducing-atlas-the-first-a...

faangguyindia

If you have Codex, what does this add over Codex's default app? I am confused. Can't you simply ask Codex in another tab to do a code review?

elpakal

At a kill SaaS hackathon at work, I was able to build something that uses a node image, installs Claude Code, runs a /review-like command, puts inline comments on the PR, and deletes old comments when rerunning. OCR seems cool, but overkill, and I'm definitely not using CodeRabbit after their CEO was on here acting snobbish a while back. Point being, AI code review in Git** itself isn't hard to do and can add a lot of value quickly.
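
The "delete old comments when rerunning" step is what keeps a bot like this from piling up duplicates. A minimal sketch of that step, assuming the bot tags its own comments with a hidden marker; `MARKER`, `Finding`, and `plan_rerun` are hypothetical names, and the actual calls to the code host's API are deliberately left out:

```python
from dataclasses import dataclass

# Hypothetical hidden tag appended to every comment the bot posts,
# so a later run can tell its own comments apart from human ones.
MARKER = "<!-- ai-review-bot -->"


@dataclass(frozen=True)
class Finding:
    """One issue reported by the review step (assumed shape)."""
    path: str
    line: int
    message: str


def plan_rerun(existing_comments, findings):
    """Return (ids_to_delete, comments_to_post) for one review run.

    existing_comments: list of dicts with at least "id" and "body".
    findings: list of Finding objects from the current run.
    """
    # Only comments carrying the marker are considered stale bot output.
    stale_ids = [c["id"] for c in existing_comments if MARKER in c["body"]]
    # Every new comment carries the marker so the next run can clean it up.
    to_post = [
        {"path": f.path, "line": f.line, "body": f"{f.message}\n\n{MARKER}"}
        for f in findings
    ]
    return stale_ids, to_post
```

Keeping the planning separate from the network calls makes the rerun logic easy to test, and human comments are never touched because they lack the marker.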

causal

I recently moved off Cursor's BugBot because it's no longer a flat $40, and I feel a little lost trying to find a viable alternative because there are so many and the pricing kind of sucks for all of them. Curious if anyone has a recommendation.

eranation

I wonder how they do against this benchmark (not that I vetted this benchmark... but still interesting to know...) https://codereview.withmartian.com

weird-eye-issue

> After installation, the ocr command is available globally.

Wish they chose a different acronym...

eranation

Ran it on a subset of 10 of the 50 PRs in this benchmark (https://codereview.withmartian.com):
- very good recall (~74%, i.e. it found a lot of the golden issues)
- not so good precision (~12%, i.e. lots of false positives)
- the precision causes the F1 to tank (~20%; if this stays the same on the full 50-PR sample, it would put it almost last, even below Kilo+Grok)
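
For reference, the F1 figure follows directly from the quoted precision and recall, since F1 is their harmonic mean. A small sketch of the arithmetic, using the approximate numbers above (the function name `f1_score` is just illustrative):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Approximate values quoted in the comment: ~12% precision, ~74% recall.
f1 = f1_score(precision=0.12, recall=0.74)
print(f"F1 = {f1:.3f}")  # roughly 0.21, in line with the ~20% quoted
```

Because the harmonic mean is dominated by the smaller value, high recall cannot compensate for low precision here.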
