
turbopuffer's FTS v2 runs long, LLM-generated queries up to 20 times faster by using vectorized MAXSCORE, which helps with large datasets and complex searches.
FTS v2, the latest version of turbopuffer's in-house text search engine, delivers a significant performance boost, especially on long queries generated by large language models (LLMs). The improvement comes from two changes: an optimized storage layout and a more efficient search algorithm. This article focuses on the new search algorithm, which uses vectorized MAXSCORE in place of block-max WAND.
For those of us working with large datasets and complex queries, performance can make or break an application. turbopuffer's FTS v2 is designed to handle longer, more intricate queries, which are often generated by automated agents or LLMs. These queries can contain dozens of terms, including stopwords, making them particularly challenging for traditional search algorithms.
The core innovation in FTS v2 is the use of vectorized MAXSCORE, which is particularly effective for long queries. Here’s how it works:
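As a rough illustration of the general idea, here is a minimal sketch of textbook MAXSCORE top-k retrieval in Python. It is not turbopuffer's implementation and it is not vectorized: the function name `maxscore_top_k`, the posting-list layout (term mapped to a list of `(doc_id, score)` pairs sorted by `doc_id`), and the use of dictionary lookups in place of skip pointers are all assumptions made for this example. The point it shows is that terms with low upper bounds (such as stopwords) stop driving candidate selection once the k-th best score is high enough, so documents matching only those terms are never scored.

```python
import heapq

def maxscore_top_k(postings, k):
    """Return the top-k (doc_id, score) pairs, highest score first.

    postings maps term -> list of (doc_id, score) sorted by doc_id, where
    score is that term's positive contribution (e.g. BM25) to the document.
    Hypothetical layout chosen for this sketch.
    """
    if k <= 0:
        return []
    # Order posting lists by ascending upper bound (their maximum score).
    lists = sorted((p for p in postings.values() if p),
                   key=lambda p: max(s for _, s in p))
    n = len(lists)
    # prefix[i] = sum of the upper bounds of lists[0..i-1].
    prefix = [0.0]
    for p in lists:
        prefix.append(prefix[-1] + max(s for _, s in p))
    lookup = [dict(p) for p in lists]  # random access stands in for skip pointers
    cursors = [0] * n
    heap = []            # min-heap of (score, doc_id), at most k entries
    threshold = 0.0      # k-th best score so far, once the heap is full
    first_essential = 0  # lists[:first_essential] are "non-essential"

    while first_essential < n:
        # Next candidate: smallest unread doc_id among the essential lists.
        heads = [lists[i][cursors[i]][0]
                 for i in range(first_essential, n) if cursors[i] < len(lists[i])]
        if not heads:
            break
        doc = min(heads)
        score = 0.0
        for i in range(first_essential, n):
            if cursors[i] < len(lists[i]) and lists[i][cursors[i]][0] == doc:
                score += lists[i][cursors[i]][1]
                cursors[i] += 1
        # Add non-essential terms, strongest first; stop once even their
        # combined upper bound cannot lift the document past the threshold.
        for i in range(first_essential - 1, -1, -1):
            if len(heap) == k and score + prefix[i + 1] <= threshold:
                break
            score += lookup[i].get(doc, 0.0)
        if len(heap) < k:
            heapq.heappush(heap, (score, doc))
        elif score > threshold:
            heapq.heapreplace(heap, (score, doc))
        if len(heap) == k:
            threshold = heap[0][0]
            # Lists whose combined upper bound cannot beat the threshold
            # stop driving candidate selection.
            while first_essential < n and prefix[first_essential + 1] <= threshold:
                first_essential += 1
    return [(d, s) for s, d in sorted(heap, key=lambda e: (-e[0], e[1]))]

# Made-up scores: "the" behaves like a stopword with a low upper bound.
example = {
    "the": [(1, 0.25), (2, 0.25), (3, 0.25), (4, 0.25), (5, 0.25), (6, 0.25)],
    "pop": [(2, 1.0), (5, 1.5)],
    "singer": [(3, 2.0), (5, 2.0), (6, 1.0)],
}
print(maxscore_top_k(example, 2))  # [(5, 3.75), (3, 2.25)]
```

In this toy run, document 4 matches only "the" and is never scored, because "the" becomes non-essential as soon as two documents are in the heap. With dozens of query terms, many of them low-value, this kind of pruning is where MAXSCORE-style algorithms tend to save work.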
Block-max WAND is a well-known algorithm for lexical search, but it can struggle with long queries. Here’s why FTS v2 outperforms it:

To demonstrate the performance gains, we ran benchmarks on a 5M-document Wikipedia export dataset. Here are some representative results:
"san francisco":
"the who":
"united states constitution":
"lord of the rings":
"pop singer songwriter born 1989 won best country song time person of the year" (a long, complex query):
For those interested in the technical nitty-gritty:
turbopuffer's FTS v2 is a significant step forward for text search, especially for applications that issue long, complex queries. By combining vectorized MAXSCORE with an optimized storage layout, the new engine runs up to 20x faster than its predecessor. This improvement is not just a theoretical gain; it translates into real-world benefits for users and developers alike.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
11 December 2025