Speeding up InfluxDB line protocol

Tancrede Collard

QuestDB Team

InfluxDB is the current market leader in time series databases. This post examines its ingestion format, InfluxDB line protocol (ILP), and compares data ingestion performance between QuestDB and InfluxDB. We'll look at data loss over UDP and at some of the reasons why QuestDB is more efficient at ingesting ILP records.

It would not be an overstatement to say that InfluxDB uses a lot of CPU. We set out to build a receiver for ILP that stores data faster than InfluxDB while being hardware-efficient.

Update 09-07-2021: Our community contributor Yitaek Hwang has put together an article comparing InfluxDB, TimescaleDB, and QuestDB, which may be useful for an up-to-date feature overview.

Why does QuestDB now support InfluxDB line protocol?

Starting with QuestDB 4.0.4, users can ingest data through ILP and use standard SQL to query InfluxDB data alongside other tables in a relational database while keeping the flexibility of ILP.
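For context, each ILP message is a single line of text: a measurement name, optional comma-separated tags, one or more fields, and an optional nanosecond timestamp. The lines below are purely illustrative; the measurement and column names are invented:

```
cpu,host=server01,region=eu-west usage_user=12.5,usage_system=3i 1580400000000000000
cpu,host=server02,region=eu-west usage_user=8.1,usage_system=2i 1580400000500000000
```

On the QuestDB side, the measurement becomes a regular table that can be queried with standard SQL alongside any other table.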

A diagram showing QuestDB ingesting schema-agnostic InfluxDB line protocol and relational data

Data loss over UDP

We conducted our testing over UDP, so we expected some level of data loss. However, we did not anticipate that InfluxDB would lose so much. We built a sender which caches outgoing messages in a small buffer before writing them to a UDP socket. It sends data as fast as possible in order to overwhelm the consumers and eventually introduce packet loss. To test different use cases, we throttled the sender by varying its buffer size: a smaller buffer results in more frequent network calls and therefore a lower sending rate.
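As a rough sketch, a throttled sender of this kind can be put together in a few lines of Java. This is illustrative only, not our actual benchmark tool; the target address, port, and message contents are made up:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

public class IlpUdpSender {
    public static void main(String[] args) throws Exception {
        // Smaller buffers mean more frequent network calls and a lower sending rate.
        int bufferSize = 1024; // varied between benchmark runs
        try (DatagramChannel channel = DatagramChannel.open()) {
            channel.connect(new InetSocketAddress("127.0.0.1", 9009));
            ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
            for (long i = 0; i < 50_000_000L; i++) {
                byte[] line = ("cpu,host=box-" + (i % 16)
                        + " value=" + (i % 100) + "i\n")
                        .getBytes(StandardCharsets.UTF_8);
                if (buf.remaining() < line.length) {
                    buf.flip();
                    channel.write(buf); // flush the buffer as one datagram
                    buf.clear();
                }
                buf.put(line);
            }
            buf.flip();
            channel.write(buf); // flush the tail
        }
    }
}
```

Varying bufferSize is what throttles the sender: a tiny buffer flushes on almost every message, while a large one batches many lines per datagram.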

The benchmark publishes 50 million messages at various speeds. We then measure the number of entries in each DB after the fact to calculate the implied capture rate.
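Counting the surviving rows is a single aggregate query per database. As a sketch, reusing the invented cpu measurement and usage_user field from the example above:

```sql
-- QuestDB
SELECT count() FROM cpu;

-- InfluxDB (InfluxQL)
SELECT COUNT("usage_user") FROM "cpu";
```

The implied capture rate is then simply that row count divided by the 50 million messages sent.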

We use a Dell XPS 15 7590 with 64GB RAM, a 6-core i9 CPU, and a 1TB SSD. In this experiment, both the sender and the QuestDB/InfluxDB instances run on the same machine, and UDP publishing goes over loopback. The OS is Fedora 31, with the OS UDP receive buffer size (net.core.rmem_max) set to 104,857,600 bytes.
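For reference, that kernel limit can be raised with sysctl; the value below simply mirrors our setup:

```shell
# raise the maximum UDP receive buffer to 100 MiB
sudo sysctl -w net.core.rmem_max=104857600
```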

It comes down to ingestion speed

Database performance is the bottleneck that results in packet loss. Messages are denied entry, and the loss rate is a direct function of the underlying database speed. By sending 50M messages at different speeds, we get the following outcome.

Table showing a comparison of the capture rate between QuestDB and InfluxDB
Capture rate as a function of sender speed

InfluxDB's capture rate rapidly drops below 50%, eventually converging toward single-digit rates.

Chart showing a comparison of the capture rate between QuestDB and InfluxDB
Capture rate as a function of sender speed

QuestDB's ingestion speed results here are obtained through ILP; ingestion is considerably faster still when using our native input formats.

Chart showing a comparison of the implied ingestion speed between QuestDB and InfluxDB
Implied ingestion speed as a function of sender speed

Why is the sender's rate slower for InfluxDB compared to QuestDB?

In this test, we run the sender and the DB on the same machine, and it turns out that InfluxDB slows down our UDP sender by cannibalizing the CPU. Here is what happens to your CPUs while using InfluxDB:

Bar chart showing the CPU usage of InfluxDB, idle versus under load
InfluxDB’s CPU usage when serving requests

When in use, InfluxDB saturates all of the CPU. As a consequence, it slows down any other program running on the same machine.

QuestDB's secret sauce

We maximize the utilization of each CPU core and extract as much performance from it as possible. For the example below, we compared InfluxDB's ingestion speed using 12 cores to QuestDB using only one core. Despite utilizing one core instead of 12, QuestDB still outperforms InfluxDB significantly.

If spare CPU capacity arises, QuestDB executes multiple ingestion jobs in parallel, leveraging several CPUs at the same time, but with one key difference: QuestDB uses work-stealing algorithms to ensure that every last bit of CPU capacity is used and no core sits idle. Let us illustrate why this matters.

Modern network cards offer far higher throughput than a single receiver can absorb. Being limited to one receiver by design, InfluxDB considerably under-utilizes the network card, and that single receiver becomes the limiting factor in the pipeline.

Chart showing InfluxDB's queuing mechanism
All CPU cores feed a single receiver, which under-utilizes the network card

Conversely, QuestDB can open parallel receivers (each requiring one core), fully utilizing the network card's capabilities. The following illustration assumes that there is spare CPU capacity in other cores to be filled. In such a scenario, QuestDB utilizes 12 cores, each of which is considerably faster than InfluxDB's combined 12 cores!

Chart showing QuestDB's queuing mechanism
Each CPU core opens an independent receiver working in parallel that fully leverages the network card
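To make the "one receiver per core" idea concrete, here is a deliberately simplified Java sketch in which several threads share one UDP port via SO_REUSEPORT (a Linux mechanism used here purely for illustration; this is not a description of QuestDB's internals):

```java
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class ParallelReceivers {
    public static void main(String[] args) {
        // one receiver thread per available core
        int workers = Runtime.getRuntime().availableProcessors();
        for (int i = 0; i < workers; i++) {
            new Thread(ParallelReceivers::receive).start();
        }
    }

    static void receive() {
        try (DatagramChannel channel = DatagramChannel.open()) {
            // SO_REUSEPORT lets several sockets bind the same port, so the
            // kernel spreads incoming datagrams across all receiver threads.
            channel.setOption(StandardSocketOptions.SO_REUSEPORT, true);
            channel.bind(new InetSocketAddress(9009));
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
            while (true) {
                buf.clear();
                channel.receive(buf);
                buf.flip();
                // parse ILP lines from buf and hand rows to the writer here
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```

With each receiver running on its own core, the network card, rather than a single consumer thread, sets the ceiling on throughput.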

Besides ingestion, InfluxDB also saturates the CPU on queries. The current user cannibalizes the whole CPU, while other users have to wait for their turn.

How the CPU is shared under InfluxDB's load
Users monopolize all CPU cores one after the other

By contrast, QuestDB uses each core separately, allowing multiple users to query or write concurrently without delay. The performance gap between QuestDB and InfluxDB grows significantly as the number of simultaneous users increases.

How the CPU is shared under QuestDB's load
Users share CPU cores and are served concurrently, with full CPU core utilization.

Get started

QuestDB supports ILP over UDP multicast and unicast sockets; TCP support will follow shortly. You don't need to change anything in your application. For Telegraf, you can simply point its UDP output at QuestDB's address and port.
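As a sketch, a Telegraf socket_writer output along these lines should do the job; the address assumes a local QuestDB instance listening on its default ILP UDP port (9009), so adjust it to your setup:

```toml
[[outputs.socket_writer]]
  ## send InfluxDB line protocol over UDP to QuestDB
  address = "udp://127.0.0.1:9009"
  data_format = "influx"
```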

Follow this link to download QuestDB. You can also use our sender against QuestDB and InfluxDB to reproduce the experiment.