Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

Anon84 138 points 59 comments June 09, 2026

Discussion Highlights (20 comments)

sys_64738

Surely 'strings' would be even better?

yodon

Feels important, but I wish they had also compared against something like MeiliSearch or Algolia.

piker

I recently watched the new Palantir + Kirkland & Ellis fund formation platform demo, and I was surprised to see how effective the union of structured data was in an agent harness. We're used to dealing with flat files, and the comparison here is between basic ways of searching what are essentially long strings. But using Palantir's "Ontology" graph framework, I think Kirkland is going to be able to achieve some exceptional and differentiating outcomes in legal tech. The whole idea assumes that they've got great structured data already, and perhaps that's the real valuable unknown, but giving an agent those tools is super powerful. I wrote about it [1] and personally came away with a different view on both Palantir and the future of agentic workflows. [1] sorry, LinkedIn: https://www.linkedin.com/pulse/fund-managements-killer-app-d...

hmokiguess

Tangential: I have a hook that rewrites grep to rg, but lately I wonder if this is actually wasteful, as the model is so biased toward grep. Is there a way to shim/alias it, perhaps?
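For illustration only, here is one shape the rewrite step of such a hook could take. The function name `rewrite_grep_to_rg` and the flag lists are assumptions made for this sketch, not part of any harness's API. It handles only a few flags that mean the same thing in both tools, drops grep's `-r`/`-R` (rg recurses by default, and `-r` in rg means `--replace`), and leaves anything else untouched. Note that rg also differs from grep in regex dialect and in skipping ignored files by default, so even a "safe" rewrite can change results.

```python
import shlex

# Flags assumed to carry the same meaning in grep and rg (a deliberately small set).
PASSTHROUGH = {"-i", "-n", "-w", "-l", "-c", "-v", "-F"}
# grep's recursive flags: rg recurses by default, and rg's -r means --replace.
DROP = {"-r", "-R"}


def rewrite_grep_to_rg(command: str) -> str:
    """Rewrite a simple `grep ...` command line to `rg ...`.

    Returns the command unchanged if it is not a grep call or if it uses
    any flag outside the small set handled here.
    """
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "grep":
        return command
    out = ["rg"]
    for tok in tokens[1:]:
        if tok in DROP:
            continue
        if tok.startswith("-") and tok not in PASSTHROUGH:
            return command  # unknown or combined flag: leave the original alone
        out.append(tok)
    return shlex.join(out)
```

With this sketch, `grep -r -n 'foo bar' src` becomes `rg -n 'foo bar' src`, while a combined flag such as `-rn` falls through unchanged because the sketch does not try to split flag clusters.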

gbacon

This is a surprising result. With structured inputs like source code, I’d expect grep to outperform semantic search, but natural language’s errors and inconsistencies seem to leave so many cracks for information to fall through.

jeffchuber

If you are truly bitter-lesson pilled, give the agent all the tools and let it decide which to use:
- regex (grep)
- hybrid search (bm25+vector)
This X vs Y is uninteresting when the answer can be both.
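A minimal sketch of the "expose both tools" idea, under stated assumptions: the tool names, the `TOOLS` registry, and `call_tool` are hypothetical, the BM25 scorer is a small hand-rolled variant rather than any search engine's implementation, and the vector half of hybrid search is omitted to keep the example dependency-free. The agent's decision is represented simply by the tool name it passes in.

```python
import math
import re
from collections import Counter


def _tok(text):
    # Lowercased alphanumeric tokens; a real system would use a proper analyzer.
    return re.findall(r"[a-z0-9]+", text.lower())


def regex_search(docs, pattern):
    """Return indices of documents matching the regex (grep-like)."""
    rx = re.compile(pattern)
    return [i for i, d in enumerate(docs) if rx.search(d)]


def bm25_search(docs, query, k1=1.5, b=0.75):
    """Return indices of documents with a positive BM25 score, best first."""
    if not docs:
        return []
    toks = [_tok(d) for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks) or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in toks for term in set(t))
    scored = []
    for i, t in enumerate(toks):
        tf = Counter(t)
        score = 0.0
        for term in _tok(query):
            if term not in tf:
                continue
            idf = math.log((len(docs) - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scored.append((score, i))
    return [i for _, i in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical registry: the harness exposes both tools and the model picks one.
TOOLS = {"regex_search": regex_search, "bm25_search": bm25_search}


def call_tool(name, docs, argument):
    return TOOLS[name](docs, argument)
```

An exact-pattern question (say, a date) maps naturally to `regex_search`, while a looser keyword question maps to `bm25_search`; the harness does not have to choose in advance.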

kwillets

I'm curious to see what patterns it's grepping.

greenavocado

This has been posted before, but a dead-simple pattern that helps enormously with steering the model to the right code area is a DESIGN.md that it creates, updates, and references periodically.

alexrigler

Combining regex filtering with semantic ranking using multi-vector embeddings has yielded good results for me. I use ColGREP from the LightOn team as a daily driver - https://github.com/lightonai/next-plaid/blob/main/colgrep/RE...
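The filter-then-rank pattern described here can be sketched as follows. This is not ColGREP's implementation or interface: the function names are made up for illustration, and cosine similarity over token counts stands in for the multi-vector embedding score, which would require a model.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Bag of lowercased alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def filter_then_rank(lines, pattern, query, top_k=3):
    """Keep lines matching the regex, then order them by similarity to the query."""
    rx = re.compile(pattern)
    q = tokens(query)
    candidates = [ln for ln in lines if rx.search(ln)]  # cheap, exact filter
    ranked = sorted(candidates, key=lambda ln: cosine(tokens(ln), q), reverse=True)
    return ranked[:top_k]
```

The regex stage keeps the candidate set small and precise, so the more expensive ranking stage only has to order lines that already satisfy the hard constraint.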

piekvorst

I have always used traditional grep to search codebases. It serves me better than an IDE when there are lots of scattered and frequent queries. grep’s design is surprisingly winning, exceeding expectations to this day.

quinncom

Don’t presume this study has anything to do with programming. They measured an agent’s ability to search long conversations, not code.

> We evaluate on a 116-question representative subset of the LongMemEval benchmark (Wu et al., 2025), which tests an agent’s ability to answer questions over long conversations spanning multiple sessions.

liminal

Is <blank> the only ML paper title?

stephantul

This paper oversells in the title. For example: what is chronos, which embedding model was used, which reranker, how was the reranking done, and why is chronos so much better than Claude Code?

softwaredoug

In my research, grep is fine if you don’t care about tokens and you have fewer than 100k files. The direct corpus interaction paper [1] shows a breakdown past this level. In my personal experience, you get slightly better relevance from grep plus an agent than from a BM25 search engine, but it requires you to eat tokens. If you think grep is great, it’s because you’ve been socially engineered to organize your content to be findable. We document why something is useful to an agent. We put it in a logical place. Just organizing content is at least half of building search, agentic or not. It’s one reason Google is successful: we’re all trying to make our content findable by the search engine. It’s not all technology :) [1] https://arxiv.org/abs/2605.05242

contextfree

It seems ridiculous that, for example, Copilot running in Visual Studio working on a C# codebase finds stuff in code by grepping around instead of using the Roslyn-driven code symbol and semantic database built into Visual Studio. I'm guessing it's because the people they get to work on AI stuff are AI People who probably only write in Python

SkyPuncher

Tables 2 and 3 tell you basically all you need to know. When you use a harness that is tuned toward programming (Codex and Claude Code), grep wins. When you use a neutral harness, vector search wins. So far, every Grep vs RAG discussion I've seen conflates overlapping factors. The most common is simply that a company rebuilt their pipeline from scratch and fixed a bunch of problems. The worst is when they go from one-shot RAG to multi-step Grep and completely miss the fact that multi-step RAG would likely get them similar results. At the end of the day, the most important thing is knowing the _product features_ your users care about and making sure they are represented in the pipeline.

_pdp_

If grep were enough, SQLite wouldn't exist.

yetanotherjosh

From the article:

> LongMemEval rewards recovering literal witnesses: exact dates, counts, preferences, and spans that often remain stable under tokenization.

Is this saying they chose a benchmark that is biased toward literal string matching, and thus works well with grep, and then (gasp) showed that grep did well, finally declaring "grep is all you need"? The examples in the benchmark's demo image (1) are all examples you could see grep doing well on: a conversation about bikes, then a query about bike(s), where "bike" is a common token hit. But not stuff like a conversation about a Beethoven sonata, then a question about classical music, where an embedding-based approach would shine. (1) https://github.com/xiaowu0162/LongMemEval/blob/main/assets/l...
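The contrast in this comment can be made concrete with a toy example. The two session snippets below are invented for illustration and are not taken from LongMemEval, and `grep_sessions` is a hypothetical helper. A literal pattern finds the bike session because the query shares a token with it, but a "classical music" query has no token in common with the Beethoven session.

```python
import re

# Invented conversation snippets, not drawn from the benchmark.
sessions = [
    "I bought a new bike last weekend and rode it to work.",
    "I have been practicing the Moonlight Sonata by Beethoven.",
]


def grep_sessions(pattern, docs):
    """Return the documents that contain a case-insensitive regex match."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [d for d in docs if rx.search(d)]


bike_hits = grep_sessions(r"bikes?", sessions)              # shared token: one hit
music_hits = grep_sessions(r"classical music", sessions)    # no shared token: no hits
```

Closing that second gap requires either the agent guessing related literals ("sonata", "Beethoven") over several grep calls or a retrieval method that matches on meaning rather than surface tokens.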

0xbadcafebee

> grep generally yields higher accuracy

And a lot more tokens, and slower speed. Yes, you can get more accuracy if you suck tons more data into context. But compare this to more advanced code agent methods like Tree Sitter, PageRank, and LSP, which build semantic maps to provide more relevant context. Grep alone can't do that.

yanhangyhy

I recently switched to https://github.com/dmtrKovalenko/fff, but I haven't noticed any big difference yet.
