Show HN: Filling PDF forms with AI using client-side tool calling

nip 51 points 23 comments May 02, 2026
copilot.simplepdf.com

Hey HN! I built SimplePDF Copilot: an AI assistant that can interact with the PDF editor. It fills fields, answers questions, focuses on a specific field, adds fields, deletes pages, and so on.

It's built on top of SimplePDF, which I started 7 years ago, pioneering privacy-respecting client-side PDF editing, and which is now used monthly by 200k+ people.

As for the privacy model: the PDF itself never leaves the browser. Parsing, rendering, and field detection all run client-side. The text the model needs (and your messages) goes to whatever LLM you point it at. By default that's our demo proxy (DeepSeek V4 Flash, rate-capped), but you can BYOK and point it at any cloud provider, or go fully local (I've been testing with LM Studio).

Unlike the existing "Chat with PDF" tools that only retrieve the text/OCR layer, Copilot can act on the PDF: filling fields, adding fields (detected client-side using CommonForms by Joe Barrow [1], jbarrow on HN, with some post-processing heuristics I added on top), focusing on fields, deleting pages, and so on.

I built this because SimplePDF is mostly used by healthcare customers where document privacy is paramount, and I wanted an AI experience that didn't require shipping PII to a third party.

Stack is pretty standard:
- TanStack Start
- AI SDK from Vercel
- Tailwind (I personally prefer CSS modules, I'm old-school, but since I'm open-sourcing it I figured Tailwind would be a better fit)

The more interesting part is the client-side tool calling: events are passed back and forth via iframe postMessage.

If you're not familiar with "tool calling" and "client-side tool calling", a quick primer: Tool calling is what LLMs use to take actions. When Claude runs grep or ls, or hits an MCP server, those are tool calls. Client-side tool calling means the intent to call a tool comes from the LLM, but the execution happens in the browser.

That matters for two reasons: speed (you can't go faster than client-to-client operations) and the ability to limit the data you expose to the LLM. For the demo I do feed the content of the document to the LLM, but that connection could be severed as simply as removing the tool that exposes the content data.

The demo is fully open source and available on GitHub [2]; the hosted demo is the one linked in this post [3]. What's not open source is SimplePDF itself (loaded as the iframe).

I could talk on and on about this. Let me know if you have any questions, anything goes!

[1] https://github.com/jbarrow/commonforms
[2] https://github.com/SimplePDF/simplepdf-embed/tree/main/copil...
[3] https://copilot.simplepdf.com/?share=a7d00ad073c75a75d493228...
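
To make the primer above concrete, here is a minimal TypeScript sketch of client-side tool calling. It is an illustration under stated assumptions, not SimplePDF's actual code: the message shapes (`ToolCall`, `ToolResult`), the tool names (`fill_field`, `get_document_content`), and the helpers (`createTools`, `handleToolCall`) are all hypothetical. The LLM only produces the intent (a tool name plus arguments); the lookup and execution happen locally, and leaving the content tool out of the registry is what cuts the model off from the document text.

```typescript
// Hypothetical message shapes: the post does not show the real event format.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type ToolResult =
  | { id: string; ok: true; data: unknown }
  | { id: string; ok: false; error: string };
type Tool = (args: Record<string, unknown>) => unknown;

// Stand-in for editor state that lives only in the browser.
type EditorState = { fields: Map<string, string>; text: string };

// Build the registry of tools the model is allowed to invoke.
// With exposeContent = false the document text is never reachable by the model.
function createTools(state: EditorState, exposeContent: boolean): Map<string, Tool> {
  const tools = new Map<string, Tool>();
  tools.set("fill_field", (args) => {
    const { fieldId, value } = args;
    if (typeof fieldId !== "string" || typeof value !== "string") {
      throw new Error("fieldId and value must be strings");
    }
    if (!state.fields.has(fieldId)) {
      throw new Error(`unknown field: ${fieldId}`);
    }
    state.fields.set(fieldId, value);
    return { fieldId };
  });
  if (exposeContent) {
    tools.set("get_document_content", () => state.text);
  }
  return tools;
}

// Execute one tool call locally and turn the outcome into a reply message.
function handleToolCall(tools: Map<string, Tool>, call: ToolCall): ToolResult {
  const tool = tools.get(call.name);
  if (!tool) {
    return { id: call.id, ok: false, error: `tool not available: ${call.name}` };
  }
  try {
    return { id: call.id, ok: true, data: tool(call.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: call.id, ok: false, error: message };
  }
}

// In a browser, the transport could be wired roughly like this
// (EDITOR_ORIGIN is a placeholder for the expected sender's origin):
//
// window.addEventListener("message", (event) => {
//   if (event.origin !== EDITOR_ORIGIN) return; // ignore unknown senders
//   const result = handleToolCall(tools, event.data as ToolCall);
//   (event.source as Window).postMessage(result, event.origin);
// });
```

Keeping the dispatcher separate from the postMessage wiring means the same registry can be exercised without an iframe, and the set of registered tools doubles as the list of what the model is permitted to see or change.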

Discussion Highlights (11 comments)

nip

Just to be clear, this is a technical demo showing what's possible with client-side tool calling + local models: LLM-assisted form filling where no document data has to leave the user's machine.

Use cases include:
- Filling foreign-language forms
- Navigating a contract before signing: "can I trust ALL the clauses here?"
- Pre-filling repetitive forms from existing data sources (CRM, EHR, etc. via MCP/RAG)

Copilot is designed to be embedded; our customers ship it white-labeled inside their own products.

iamflimflam1

Might be worth making it clearer that the chat messages are going to a remote server, so any PII in them is leaving the local machine.

grahammccain

Keep going though. I’m definitely looking for something like this once we can get something secure that we can use with proprietary and PII data.

kiney

Does it support XFA forms?

simianwords

It looks cool, but how is this different from me uploading to ChatGPT and asking it to fill it in?

nilirl

One thing I've struggled with before is building a collection of data models based off of a collection of PDF forms. I wanted to abstract away the PDF form by building my own HTML form on top of a data model that can later be used to programmatically fill the PDF. Since I had 100s of PDFs, I wanted an OCR+LLM pipeline to build a data model for each PDF. Unfortunately, OCR + LLM works ~90% of the time, but sometimes fields are missed or mislabeled in the data model. Does this sometimes get it wrong during programmatic filling? How do you deal with that?

tyingq

It is cool, but the demo is flawed, right at the second field: "What's the business name/disregarded entity name, if different from above (line 2)?" As far as I can tell, there's no way to skip this or leave it empty, not even "use a space". And that field would be empty for many or most.

grassfedgeek

In the chat box I typed my SSN is "123-45-6789". It filled it in in the wrong box (4 Exemptions). What problem is this solving? Isn't it easy enough to just click in the correct box and type the values? How does this compare to Claude Cowork?

BloondAndDoom

This looks interesting. I’m looking for a good personal PDF editor that I can use on Windows and that is private. Seems like your product is more for organizations; any idea if such a thing exists? It seems like the market is full of bloated (Adobe/Foxit) or not properly working editors.

kassenov

Do you think this will work with Chrome's built-in AI [1]? [1] https://developer.chrome.com/docs/ai/built-in

s09dfhks

I managed to do this locally with Claude and some Python libraries. Claude looked over the PDF, found the fields, and wrote a Python script to insert data at the appropriate locations. Sure, it took some futzing to get everything to line up properly, but as others have said, my PDF wasn't sent to a remote server.
