
As AI coding assistants become more integral, the debate over efficient code search intensifies. Vector search-powered RAG offers semantic insight, outpacing traditional grep methods in relevance and context.
Engineering
August 24, 2025
Cheney Zhang
The landscape of AI coding assistants has seen explosive growth over the past two years. Tools like Cursor, Claude Code, Gemini CLI, and Qwen Code have become indispensable to millions of developers. However, a critical debate is brewing: how should an AI coding assistant search your codebase for context?
There are two primary approaches: vector search-powered RAG, which retrieves code by semantic similarity, and keyword search with grep, which matches literal strings.
Claude Code and Gemini CLI have opted for the latter. A Claude engineer admitted on Hacker News that Claude Code doesn't use RAG at all; instead, it performs a line-by-line grep search (referred to as "agentic search"). This method has no semantic understanding or structural context and relies solely on raw string matching.
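To make "raw string matching" concrete, here is a minimal sketch of a line-by-line literal search. It is not Claude Code's implementation; the `grep_lines` helper, the `Match` type, and the two-file `REPO` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    path: str
    line_no: int
    text: str


def grep_lines(files: dict[str, str], pattern: str) -> list[Match]:
    """Return every line that contains `pattern` as a literal substring."""
    matches = []
    for path, source in files.items():
        for line_no, text in enumerate(source.splitlines(), start=1):
            if pattern in text:  # raw string match, no notion of meaning
                matches.append(Match(path, line_no, text.strip()))
    return matches


# Hypothetical two-file repository used only for illustration.
REPO = {
    "auth/session.py": (
        "def verify_credentials(user, password):\n"
        "    return check_hash(user, password)\n"
    ),
    "docs/notes.md": "TODO: the login page needs a new logo\n",
}

hits = grep_lines(REPO, "login")
for hit in hits:
    print(f"{hit.path}:{hit.line_no}: {hit.text}")
```

In this toy repository, searching for "login" surfaces the note about the logo but not `verify_credentials`, because the function that actually handles logins never uses that word.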
Supporters of grep argue for its simplicity. They point out that grep is fast, exact, and predictable, which are crucial qualities in programming, where precision is paramount. In their view, current embeddings are too fuzzy to be trusted with critical tasks.
Critics of grep, however, see it as a dead end. Grep can flood you with irrelevant matches, consume excessive tokens, and slow down your workflow. Without semantic understanding, the AI is essentially debugging blindfolded.
After building and testing my own solution, I’ve found that vector search-based RAG significantly outperforms grep in several key areas:

I encountered these issues while debugging a complex problem. Claude Code executed grep queries across my repository, returning large chunks of irrelevant text. After one minute, I still hadn’t found the relevant file. Five minutes later, I finally had the right 10 lines, but they were buried in 500 lines of noise.
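The arithmetic behind that last observation is worth spelling out. The sketch below assumes the 10 relevant lines were part of 500 returned lines, and it uses a made-up average of 12 tokens per line; both figures are assumptions for illustration, not measurements.

```python
def context_efficiency(relevant_lines: int, returned_lines: int) -> float:
    """Fraction of the returned lines that were actually useful."""
    if returned_lines <= 0:
        raise ValueError("returned_lines must be positive")
    return relevant_lines / returned_lines


# Assumed average; real token counts depend on the tokenizer and the code.
TOKENS_PER_LINE = 12

relevant, returned = 10, 500
efficiency = context_efficiency(relevant, returned)
wasted_tokens = (returned - relevant) * TOKENS_PER_LINE

print(f"useful share of context: {efficiency:.0%}")
print(f"estimated wasted tokens: {wasted_tokens}")
```

Under these assumptions only 2% of what the model read was useful, and the remaining lines cost roughly 5,880 tokens for a single search.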
This isn't an isolated incident. A quick look at Claude Code’s GitHub issues reveals numerous frustrated developers facing similar challenges:
The community's frustration can be summarized into three main pain points:
Vector search-based RAG addresses these issues by leveraging semantic understanding:
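As a rough illustration of the retrieval step, the sketch below ranks code chunks by cosine similarity to a query vector and keeps the top results. The three-dimensional vectors are hand-assigned stand-ins for real embeddings, and the chunk names are hypothetical; a real system would obtain the vectors from an embedding model and store them in a vector database.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k (score, chunk) pairs most similar to the query."""
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hand-assigned vectors standing in for real embeddings (assumption).
# Axes, loosely: [authentication, UI/branding, database].
INDEX = {
    "auth/session.py::verify_credentials": [0.9, 0.1, 0.2],
    "docs/notes.md::logo TODO": [0.2, 0.9, 0.0],
    "db/pool.py::open_connection": [0.1, 0.0, 0.9],
}

# Pretend embedding of the query "where is user login handled?"
QUERY_VEC = [0.8, 0.2, 0.1]

results = top_k(QUERY_VEC, INDEX, k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

Because ranking depends on vector proximity rather than shared words, the authentication chunk ranks first even though its name never contains the word "login". Only the top-ranked chunks are passed to the model, which is where the token savings come from.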
While grep has its place in certain scenarios, it’s clear that vector search-based RAG offers significant advantages for AI coding assistants. It not only makes search faster and more accurate but also reduces token usage by 40% or more. For developers looking to streamline their workflow and reduce costs, the shift to vector search is a no-brainer.