Case study

QuestDB enables machine learning engines that power Yahoo search

Yahoo uses QuestDB in an embedded capacity within its machine learning engine, which is deployed in systems that serve close to a billion users at a rate of 500k queries per second.

An advertisement for showing personalized search across multiple mobile devices

No external monitoring solutions required for autoscaling decisions

Native timeseries support within the ML engine in an embedded capacity

Helps provide fault-tolerance at the software architecture layer

Powering web-scale systems with hundreds of millions of monthly users

Comprehensive documentation accelerated the integration

High-performance solution for real-time resource monitoring

Yahoo’s media, technology, and business platform serves content to hundreds of millions of users per month. To power search and recommendation, Yahoo relies on a custom machine learning engine which serves personalized content in real time. High performance is critical at this scale because every extra millisecond a consumer has to wait matters.

In this case study, VP Architect Jon Bratseth describes how and why QuestDB is relied upon within high-performance machine learning engines at Yahoo.

Why Yahoo uses QuestDB for our Vespa machine learning engine

Vespa is an open-source big data processing and serving engine powering applications at Yahoo. Many products, such as Yahoo News, Yahoo Sports, and Yahoo Finance, use this engine to store, search, organize, and make machine-learned inferences over big data at serving time.

These platforms serve close to a billion users, and we needed an embedded solution that monitors resource utilization metrics in nodes within our application clusters. We decided to use QuestDB to store and analyze application monitoring metrics quickly and easily within the application itself, removing external failure modes and leaving plenty of headroom for performance.

The scale and scope of our application monitoring needs

The Vespa platform performs machine-learned inferences over big data with high availability and high performance. The engine powers search, question answering for chatbots, recommendation and personalization, and typeahead suggestions for systems that serve close to a billion users at a rate of 500k queries per second. In these use cases, the engine needs to select a subset of data from a vast data corpus, evaluate machine-learned models over the selected data, organize and aggregate it, and return results in less than 100 milliseconds while the data corpus is continuously changing.

Chart showing a continuous integration pipeline for Yahoo's Vespa engine

How we capture and store application metrics in QuestDB

We’re running a large number of deployments on behalf of customers. Each consists of several clusters running in dedicated Docker containers. The clusters are controlled by a shared control plane that consists mainly of ‘configuration server clusters’: a single cluster runs per environment and region and handles application configuration.

Autoscaling lets you adjust the hardware resources allocated to application clusters automatically depending on actual usage. You want your application to use as few resources as possible to minimize cost, but at the same time, you don’t want to run out of resources when traffic is high or you feed more data. We want clusters to maintain the optimal allocation required to handle the current load at any time.

Chart showing resource utilization of nodes within Yahoo's Vespa engine

We run QuestDB embedded in the admin and configuration clusters to store a few days of resource usage for the nodes. The data is sampled continuously for each cluster or node, and we query it to make scaling decisions, which lets us act on the usage patterns observed on the system in the recent past. When users deploy a new cluster (or application), it initially gets the minimal resources within a given range by default. When engineers enable autoscaling for a cluster, the cluster continues unchanged until autoscaling determines that a change is beneficial.
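As a rough illustration of the sampling and querying pattern described above, the following Python sketch keeps a bounded window of per-node utilization samples and summarizes the recent past. It does not use QuestDB's API: the `Sample` and `UsageWindow` names, the `cpu` field, and the retention value are hypothetical stand-ins for an embedded time series table and a query over it.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp: float  # seconds since an arbitrary epoch
    node: str         # hypothetical node identifier
    cpu: float        # CPU utilization as a fraction in [0, 1]


class UsageWindow:
    """In-memory stand-in for a time series table with a retention period."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._samples: deque[Sample] = deque()

    def add(self, sample: Sample) -> None:
        # Samples are assumed to arrive in timestamp order.
        self._samples.append(sample)
        cutoff = sample.timestamp - self.retention_seconds
        # Drop samples that have aged out of the retention window.
        while self._samples and self._samples[0].timestamp < cutoff:
            self._samples.popleft()

    def average_cpu(self, since: float) -> dict[str, float]:
        """Average CPU per node over samples with timestamp >= since."""
        per_node: dict[str, list[float]] = {}
        for s in self._samples:
            if s.timestamp >= since:
                per_node.setdefault(s.node, []).append(s.cpu)
        return {node: sum(v) / len(v) for node, v in per_node.items()}
```

In this sketch, a scaling routine would call `average_cpu` with a recent cutoff and compare the result against a target utilization. Keeping the store inside the process that makes the decision mirrors the embedded setup: there is no external monitoring service whose outage could block the query.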

Our ‘ideal utilization’ takes into account that a node may be down or failing and that another region may be down, which can cause a doubling of traffic. We also factor in the headroom needed for maintenance operations and for handling requests with low latency.
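To make this reasoning concrete, here is a small worked sketch of how such a target might be derived. The formula, the function names, and the default values are illustrative assumptions, not Vespa's actual calculation. It reserves a fraction of capacity as headroom for maintenance and latency, assumes one node in the cluster may be lost, and assumes traffic may double if another region fails.

```python
def ideal_utilization(nodes: int, headroom: float = 0.2,
                      region_failover_factor: float = 2.0) -> float:
    """Target utilization per node under normal load (illustrative only).

    nodes: number of nodes in the cluster.
    headroom: fraction of capacity reserved for maintenance and latency
        (hypothetical default).
    region_failover_factor: traffic multiplier if another region goes down;
        2.0 models the doubling of traffic mentioned in the text.
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to tolerate losing one")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable = 1.0 - headroom            # capacity left after reserved headroom
    after_node_loss = (nodes - 1) / nodes  # share of capacity if one node fails
    return usable * after_node_loss / region_failover_factor


def should_scale_up(observed: float, nodes: int) -> bool:
    """True when observed utilization exceeds the illustrative target."""
    return observed > ideal_utilization(nodes)
```

With ten nodes and the assumed defaults, the target is 0.8 × 0.9 / 2, or about 0.36. Under these assumptions a cluster running at 50% utilization would be a candidate for more resources, while one at 30% would not.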

Why QuestDB is a good database to use for fault-tolerance

We decided that metric collection should be embedded within the nodes that manage the clusters. We embed QuestDB to avoid failure scenarios where some parts of the cluster work, but others don’t. Using a time series database within Yahoo’s recommendation engine helps build fault-tolerance measures into the engine’s architecture and allows engineers to make AI-driven decisions using customer data in real time, at any scale.

We use QuestDB to monitor metrics for autoscaling decisions within our ML engine that provides search, recommendation, and personalization via models and aggregations on continuously changing data.

Jon Bratseth, VP Architect at Yahoo