Show HN: Local-first fast CPU image to text for screenshots, PDFs, webpages

mrkn1 14 points 16 comments June 05, 2026
github.com

Discussion Highlights (8 comments)

garrett2558

Very cool, I'm building my own local-first product as well

abstract257

Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...

BIGFOOT_EXISTS

Now this is legit cool, keep up the great work.

vivzkestrel

How well do you think this'll work with code? I mean taking code screenshots and converting them into actual code for VS Code.

kouru225

Roman alphabet only or does this work with other alphabets?

monosma

What was the reason for adopting PaddleOCR? Can other OCR models be used as well?

KetoManx64

What's the performance like compared to Tesseract? I don't see Tesseract mentioned anywhere in the README, which is surprising considering that's the number one tool most people go to for image-to-text OCR.

lavaman131

This is awesome! Been needing something like this for some research paper diagrams I've been indexing.
