Mistral OCR 4
meetpateltech
449 points
115 comments
June 23, 2026
Related Discussions
- Mistral Small 4 pember · 56 pts · March 16, 2026 · 74% similar
- Mistral Medium 3.5 meetpateltech · 450 pts · April 29, 2026 · 64% similar
- Mistral AI acquires Emmi AI doener · 215 pts · May 19, 2026 · 61% similar
- Notes from the Mistral AI Now Summit vnglst · 331 pts · May 29, 2026 · 60% similar
- Show HN: Online OCR Free – Batch OCR UI for Tesseract, Gemini and OpenRouter naimurhasanrwd · 13 pts · March 03, 2026 · 59% similar
Discussion Highlights (20 comments)
Ducki
I was processing 55-year-old paper files, most of them severely degraded, with its predecessor model. I was very impressed! I also tried ABBYY FineReader, but it didn't even come close in my experience.
jppope
Is there something wrong with their certificate? Chromium is saying the HTTPS connection isn't valid.
mdrzn
It'll be interesting to see how this ranks against https://github.com/baidu/Unlimited-OCR
ge96
1000 pages for $4? Damn. I wonder how it compares to LlamaParse.
tdubey
Are there benchmarks for how this performs on charts, or maybe more accurately, plots? I've yet to find a model that can digitize a plot into X,Y points with some accuracy in my use case of digitizing old datasheets.
utopiah
"A note on out-of-scope use. OCR 4 is a document-understanding model, not a decision-maker. It is not intended for medical diagnosis, legal advice or judgment, high-stakes financial decisions, safety-critical systems, real-time/latency-sensitive processing, or non-document inputs (raw audio, video, etc.)." Can't wait for the "oh so innovative" manager who will suggest during the next meeting "Ok... but what if WE used it for high-stakes financial decisions on non-document inputs like a photo from my phone?" I guarantee you somebody on HN is going to comment about this "idea" next week.
gpm
Do these models (this one or its competitors) do handwriting recognition?
Insanity
Recently I tried OCR with Opus 4.8 (I know, not technically the right tool for the job). All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all of them as "high confidence". I probably should have tried a more OCR-specific model.
stri8ted
Way too expensive. Google Vision OCR (which they failed to compare against) is $1.50 per 1k pages, vs. $4 per 1k from Mistral.
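To put the two quoted prices side by side, here is a minimal sketch of the arithmetic. It assumes flat per-1,000-page pricing with no free tier, volume discount, or batch rate, which may not match either vendor's actual billing; the `cost` and `compare` helpers are hypothetical names used only for illustration.

```python
# Per-1,000-page prices in USD, as quoted in this thread.
# Assumption: flat pricing, no free tier or volume discounts.
PRICE_PER_1K = {
    "Mistral OCR 4": 4.00,
    "Google Vision OCR": 1.50,
}


def cost(pages: int, price_per_1k: float) -> float:
    """Return the USD cost of processing `pages` pages at a flat per-1k rate."""
    return pages / 1000 * price_per_1k


def compare(pages: int) -> dict[str, float]:
    """Return the estimated cost for each service, rounded to cents."""
    return {name: round(cost(pages, price), 2) for name, price in PRICE_PER_1K.items()}


if __name__ == "__main__":
    for pages in (1_000, 100_000, 1_000_000):
        print(pages, compare(pages))
```

Under these assumptions the gap is a constant factor of 4 / 1.5, roughly 2.7x, at any volume, so whether it matters depends mostly on how many pages are processed and on the accuracy difference, which the quoted prices say nothing about.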
pmxi
This has been a niche where Mistral has actually been successful. Btw, Hindi and Japanese are bucketed in "Rare Languages," which is odd.
greenleafone7
After paying for Mistral and using it for a while I genuinely hated it. It's a productivity black hole and can't realistically compete with anyone. I chose it only because it was European, but no. I'd rather let my one year subscription go to waste than use anything 'Mistral'.
mcbetz
There's little on what changed other than bounding boxes and double the price compared to their previous OCR v3 model from December (https://mistral.ai/news/mistral-ocr-3/). Different benchmarks were used back then.
MostlyStable
Does anyone know of OCR benchmarks that include handwritten documents? I'm currently using Gemini pro 3 for this, and error rates are quite low, but it's a little bit pricey. I'd be interested in a cheaper model that could perform as well, but almost all the OCR benchmarks I'm aware of (and, I believe, all the ones included in this announcement) are about printed/typeset text.
andrewmutz
A tangential observation: the video on the linked page wasn't what I expected. I thought Mistral was a European AI company, so I didn't expect the video to be filmed in San Francisco featuring three people who don't seem to be European. I'm not against them being a global organization, that's wonderful. I was just surprised. I expected a Parisian office and European accents.
mrkn1
This runs for free on CPU: https://github.com/kouhxp/textsnap
coulix
I wonder how it compares to reducto, pulse, and extendai.
themanmaran
It's cheap at $4/1k, but I'm hesitant to even benchmark this one again since the previous versions were all "98% accurate based on internal benchmarks of 4 pdfs" and ended up falling short of almost everything else on the market [1]. Even in this one, they just report that OlmOCRBench and OmniDocBench have "known limitations" and that's why they report flagship numbers from their internal benchmark. [1] https://getomni.ai/blog/benchmarking-open-source-models-for-...
v3ss0n
Not open source, right?
bastawhiz
The comparisons rank it against GPT and Gemini but not Claude. Is Claude's vision support simply not competitive when it comes to OCR tasks?
Ninjinka
Is there a complete list of the languages they support, and benchmarks by language, instead of just "Rare Languages"?