turbopuffer: Consistently High Performance at Scale

Case Study: How turbopuffer leverages Polar Signals for Continuous Profiling

March 26, 2025

Overview

turbopuffer is an object-storage-native search engine designed to excel at vector search, and it is expanding into text search. As a performance-critical platform, turbopuffer needed a solution that provides deep insight into system behavior, especially during regressions and performance bottlenecks. By integrating Polar Signals Cloud throughout its development process, turbopuffer has improved performance, made incidents easier to debug, and increased overall system reliability.

Example Case
In a recent instance, turbopuffer noticed higher-than-usual latency and could immediately see in Polar Signals that std::std_float::StdFloat::mul_add was dominating CPU time.

Iciclegraph showing StdFloat::mul_add as the dominating frame

Solution

The cause became evident as soon as the team looked at Polar Signals Cloud: std::std_float::StdFloat::mul_add was calling __fmaf_fma3, a non-vectorized function. This showed that the code was not using the AVX512 SIMD implementation as expected. It turned out that the `RUSTFLAGS` setting that configures the target CPU had been accidentally dropped from the build process. Ensuring that the binaries are built for the appropriate CPU targets resolved the issue.
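
To illustrate the failure mode, here is a minimal sketch, not turbopuffer's actual code. It assumes a scalar cosine similarity built on the stable `f32::mul_add` rather than the portable-SIMD `StdFloat::mul_add` seen in the profile. The function names are hypothetical. The `cfg!(target_feature = ...)` checks are evaluated at compile time, so they reflect the flags the binary was built with, not the CPU it runs on.

```rust
/// True if this binary was compiled with the x86 `fma` feature enabled.
/// Evaluated at compile time: it reflects build flags, not the host CPU.
pub fn compiled_with_fma() -> bool {
    cfg!(target_feature = "fma")
}

/// True if this binary was compiled with the x86 `avx512f` feature enabled.
pub fn compiled_with_avx512f() -> bool {
    cfg!(target_feature = "avx512f")
}

/// Hypothetical scalar cosine similarity (an assumption for illustration).
/// When the `fma` feature is not enabled at compile time, `mul_add`
/// typically becomes a call into the math library instead of a single
/// fused multiply-add instruction.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let (mut dot, mut norm_a, mut norm_b) = (0.0f32, 0.0f32, 0.0f32);
    for (&x, &y) in a.iter().zip(b) {
        dot = x.mul_add(y, dot); // dot += x * y
        norm_a = x.mul_add(x, norm_a); // norm_a += x * x
        norm_b = y.mul_add(y, norm_b); // norm_b += y * y
    }
    let denom = (norm_a * norm_b).sqrt();
    if denom == 0.0 {
        0.0 // define similarity with a zero vector as 0
    } else {
        dot / denom
    }
}
```

Logging `compiled_with_fma()` and `compiled_with_avx512f()` at startup, or asserting on them in CI, is one possible guard against build flags being dropped silently. Which target CPU to pass through `RUSTFLAGS` (via rustc's `-C target-cpu` option) depends on the deployment hardware and is not stated in the case study.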

Results

In this particular instance, the CPU time used by the calling cosine_similarity function dropped from ~70% of the total CPU time spent by the workload to ~20%.

Iciclegraph showing that the std::std_float::StdFloat::mul_add function has visibly disappeared.

This translated to a ~30% latency improvement across key percentiles:

  • p50 reduced from 36ms to 26ms
  • p75 reduced from 70ms to 45ms
  • p90 reduced from 128ms to 86ms
improvement in terms of percentiles
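
The "~30%" figure can be checked against the three percentiles listed above. The following is a small sketch of that arithmetic. The function and constant names are made up for illustration, and the only data used are the before and after latencies quoted in the text.

```rust
/// Percentage reduction when going from `before_ms` to `after_ms`.
pub fn percent_reduction(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms - after_ms) / before_ms * 100.0
}

/// (percentile, before, after) latencies in milliseconds, as quoted in the text.
pub const LATENCIES_MS: [(&str, f64, f64); 3] = [
    ("p50", 36.0, 26.0),
    ("p75", 70.0, 45.0),
    ("p90", 128.0, 86.0),
];

/// One summary line per percentile, rounded to one decimal place.
pub fn report() -> Vec<String> {
    LATENCIES_MS
        .iter()
        .map(|(name, before, after)| {
            format!("{name}: {:.1}% lower", percent_reduction(*before, *after))
        })
        .collect()
}
```

The individual reductions come out at roughly 28%, 36%, and 33%, which is consistent with the rounded ~30% headline number.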

Polar Signals Cloud’s profiling capabilities drastically reduced the time required to isolate and resolve this regression.

Beyond this case

Beyond this particular case, turbopuffer has been using Polar Signals Cloud to inform development decisions, and it has become a vital tool for incident response.

"First instinct for our on-calls is now to check PolarSignals, whether memory grows unexpectedly quick, CPU is high, or we’re planning how to get the next throughput increase in indexing. PolarSignals delivers the answer, fast." - Simon Eskildsen, CEO of turbopuffer

By leveraging Polar Signals Cloud, turbopuffer is able to deliver a consistently high-performance product to their customers.
