
Semantic search helps coding agents navigate complex codebases, improving the accuracy of their answers by an average of 12.5%.
When a coding agent receives a prompt, it needs to understand the codebase well enough to respond accurately, which means reading files and searching for relevant information. Semantic search helps here: it retrieves code segments that match a natural-language query such as “where do we handle authentication?”, complementing the regex-based search that tools like grep provide.
Cursor’s coding agents leverage semantic search to improve their performance over large codebases. Here’s a breakdown of how they achieve this:
Custom Embedding Model: Cursor trained its own embedding model specifically for codebase navigation. This model captures the semantic meaning of code segments, making it easier to match natural language queries.
Indexing Pipelines: Efficient indexing pipelines ensure fast retrieval of relevant code snippets. These pipelines are optimized for large-scale codebases.
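The two pieces above, an embedding model and an index built ahead of time, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in (it only matches shared words, whereas a trained embedding model captures meaning), and the file paths and helper names are hypothetical. It is not Cursor's implementation.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks: dict[str, str]) -> dict[str, Counter]:
    """Indexing step: embed every code chunk once, ahead of query time."""
    return {path: embed(body) for path, body in chunks.items()}


def search(index: dict[str, Counter], query: str, k: int = 1) -> list[str]:
    """Query step: embed the question and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda path: cosine(q, index[path]), reverse=True)
    return ranked[:k]


# Hypothetical code chunks, keyed by made-up file paths.
chunks = {
    "auth/login.py": "def login(user, password): verify the password and handle authentication",
    "billing/invoice.py": "def render_invoice(order): compute totals and tax for the invoice",
}
index = build_index(chunks)
top = search(index, "where do we handle authentication?")
```

In this sketch the query is answered by comparing vectors rather than matching a pattern, which is the property that lets a real embedding model rank code that shares no literal text with the question.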
Using semantic search, Cursor’s agents exhibit several notable improvements:
Higher Accuracy: On average, agents achieve 12.5% higher accuracy in answering questions (ranging from 6.5% to 23.5% depending on the model).
Better Code Retention: The code changes produced by agents are more likely to be retained in user codebases.
Fewer Iterations: Users require fewer iterations to arrive at a correct solution, reducing development time and effort.
Consistent Gains Across Models: All tested models, including frontier coding models, show improved accuracy with semantic search.
Cursor maintains an evaluation dataset called Cursor Context Bench, which focuses on retrieving information in codebases with known correct answers. Cursor uses this dataset to evaluate all of the most-used models in Cursor, including its own custom model, Composer.
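To make the idea of scoring against known correct answers concrete, here is a minimal sketch of one possible metric. The source does not describe how Cursor Context Bench is scored, so the `Case` structure, the `hit_rate` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Case:
    question: str
    expected: set[str]  # files known to contain the answer (hypothetical format)


def hit_rate(cases: list[Case], retrieved: dict[str, list[str]]) -> float:
    """Fraction of cases where at least one expected file was retrieved."""
    if not cases:
        return 0.0
    hits = sum(
        1 for case in cases if case.expected & set(retrieved.get(case.question, []))
    )
    return hits / len(cases)


# Made-up cases and retrieval results, for illustration only.
cases = [
    Case("where do we handle authentication?", {"auth/login.py"}),
    Case("where are invoices rendered?", {"billing/invoice.py"}),
]
retrieved = {
    "where do we handle authentication?": ["auth/login.py", "auth/session.py"],
    "where are invoices rendered?": ["docs/README.md"],
}
score = hit_rate(cases, retrieved)
```

Running the same cases once with semantic search enabled and once without would give two scores to compare, which is the shape of the comparison the article reports.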

To understand the impact on end-user experience, Cursor conducted an A/B test where both groups used the same model, but one group's agent had access to semantic search while the other relied solely on traditional search tools like grep. The results were telling:
Code Retention: Code written by agents with access to semantic search is more likely to remain in user codebases. Code retention rose by 0.3% overall and by 2.6% in large codebases with 1,000 files or more.
Dissatisfied User Requests: Agents without semantic search required more follow-ups and corrections. The test showed a 2.2% increase in dissatisfied follow-up user requests when semantic search was not available.
The effect size is lower in the A/B tests because they cover all agent queries, many of which do not require search.
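The dilution argument can be checked with simple arithmetic. The sketch below assumes that queries which do not need search are unaffected by the change. The function name and the numbers are illustrative and do not come from Cursor's data.

```python
def diluted_effect(effect_on_search_queries: float, share_needing_search: float) -> float:
    """Overall effect when only a share of queries can benefit and the rest are unchanged."""
    if not 0.0 <= share_needing_search <= 1.0:
        raise ValueError("share_needing_search must be between 0 and 1")
    return effect_on_search_queries * share_needing_search


# Illustrative numbers only: a 10-point gain on search-dependent queries,
# with 20% of all queries depending on search.
overall = diluted_effect(10.0, 0.2)
```

Under these assumed numbers the measured effect across all queries would be 2 points, a fifth of the gain seen on the queries that actually exercise search.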
A key factor enabling these results is Cursor’s custom embedding model. This model is trained on agent sessions, where each session involves multiple searches and file openings before finding the right code. By analyzing these traces, the model learns to identify relevant code segments more effectively:
Training Data: Agent sessions provide rich training data that captures the context in which searches are performed.
Retrospective Analysis: Post-session analysis helps refine the model by identifying patterns in successful and unsuccessful searches.
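One way to picture how a session trace could become training data is sketched below. The source does not spell out the procedure, so this is an assumption: each query in a session is paired with a file that proved useful (a positive) and a file that was opened but did not help (a negative). The `Session` structure and `training_triples` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Session:
    queries: list[str]  # searches the agent issued during the session
    opened: list[str]   # files the agent opened along the way
    useful: set[str]    # files that turned out to contain the right code


def training_triples(session: Session) -> list[tuple[str, str, str]]:
    """Build (query, positive_file, negative_file) triples from one session."""
    negatives = [path for path in session.opened if path not in session.useful]
    return [
        (query, positive, negative)
        for query in session.queries
        for positive in sorted(session.useful)
        for negative in negatives
    ]


# A made-up session trace, for illustration only.
session = Session(
    queries=["auth handler", "login password check"],
    opened=["auth/session.py", "auth/login.py"],
    useful={"auth/login.py"},
)
triples = training_triples(session)
```

Triples of this kind are a common input format for contrastive training of embedding models, where the model is pushed to score the positive above the negative for the same query.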
Semantic search is a powerful tool for enhancing coding agent performance. By integrating custom embedding models and efficient indexing pipelines, Cursor’s agents can navigate large codebases more accurately and efficiently, leading to better code retention and fewer user follow-ups. This approach not only improves the developer experience but also accelerates development cycles.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
6 November 2025