InfluxDB is the current market leader in time series databases. This post examines its ingestion format, the InfluxDB line protocol (ILP), and compares data ingestion performance between QuestDB and InfluxDB. We'll look at data loss over UDP and some of the reasons why QuestDB is more efficient at ingesting records in ILP.
It would not be an overstatement to say that InfluxDB uses a lot of CPU. We set out to build a receiver for ILP that stores data faster than InfluxDB while being hardware efficient.
Update 09-07-2021: Our community contributor Yitaek Hwang has put together an article which compares InfluxDB, TimescaleDB and QuestDB which may be useful for an up-to-date feature overview.
Starting with QuestDB 4.0.4, users can ingest data through ILP and use standard SQL to query InfluxDB data alongside other tables in a relational database while keeping the flexibility of ILP.
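To make the format concrete, here is a minimal sketch of how a single ILP record can be assembled. An ILP line consists of a measurement name, optional comma-separated tags, a space, one or more fields, and an optional timestamp. The helper names (`ilp_line`, `format_field`) and the sample values are ours and purely illustrative, and the escaping covers only the common cases.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character of `chars` found in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value) -> str:
    """Render a field value: booleans, integers (suffix i), floats, strings."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def ilp_line(measurement: str, tags: dict, fields: dict, timestamp_ns=None) -> str:
    """Build one newline-terminated ILP record (illustrative helper)."""
    head = _escape(measurement, ", ")
    for key, val in tags.items():
        head += f",{_escape(key, ',= ')}={_escape(val, ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={format_field(val)}" for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line + "\n"


record = ilp_line(
    "sensors",
    {"location": "london"},
    {"temperature": 21.5, "count": 3},
    1465839830100400200,
)
print(record, end="")
```

Each record is a single line of text, so a sender can pack many of them into one network message.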
The UDP receiver has been deprecated since QuestDB version 6.5.2. We recommend the TCP receiver instead.
We conducted our testing over UDP and therefore expected some level of data loss. However, we did not anticipate that InfluxDB would lose so much. We built a sender that caches outgoing messages in a small buffer before sending them to a UDP socket. It sends data as fast as possible to overpower the consumers and eventually introduce packet loss. To test different use cases, we throttled the sender by varying its buffer size: a smaller buffer results in more frequent network calls and therefore a lower sending rate.
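The buffering idea can be sketched as follows. This is not the sender used in the benchmark, only a minimal illustration under our own assumptions: the class name `BufferedUdpSender`, its `buffer_size` parameter and its `datagrams_sent` counter are hypothetical. It accumulates records until the next one would overflow the buffer, then sends the whole buffer as one datagram.

```python
import socket


class BufferedUdpSender:
    """Batches records into a buffer and sends each full buffer as one datagram.

    Illustrative sketch only; names and defaults are assumptions.
    """

    def __init__(self, host: str, port: int, buffer_size: int = 1024):
        self._addr = (host, port)
        self._capacity = buffer_size
        self._buf = bytearray()
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.datagrams_sent = 0  # number of network calls made so far

    def send(self, record: bytes) -> None:
        # Flush first if appending this record would exceed the buffer size.
        if self._buf and len(self._buf) + len(record) > self._capacity:
            self.flush()
        self._buf += record

    def flush(self) -> None:
        # One sendto() call per buffer: a smaller buffer means more calls.
        if self._buf:
            self._sock.sendto(bytes(self._buf), self._addr)
            self._buf.clear()
            self.datagrams_sent += 1

    def close(self) -> None:
        self.flush()
        self._sock.close()
```

With a buffer that holds two records, three calls to `send` followed by `close` produce two datagrams; with a buffer that holds only one record, the same calls produce three. This is the throttling knob described above: the smaller the buffer, the more network calls per record and the lower the sending rate.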
The benchmark publishes 50 million messages at various speeds. We then measure the number of entries in each DB after the fact to calculate the implied capture rate.
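The implied capture rate is simply the number of rows found in the database divided by the number of messages sent. A small sketch, with a hypothetical function name and made-up row counts that are not benchmark results:

```python
def capture_rate(rows_stored: int, messages_sent: int) -> float:
    """Fraction of sent messages that ended up stored in the database."""
    if messages_sent <= 0:
        raise ValueError("messages_sent must be positive")
    return rows_stored / messages_sent


# Hypothetical example: 50 million messages sent, 25 million rows counted.
rate = capture_rate(25_000_000, 50_000_000)
print(f"capture rate: {rate:.0%}")
```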
We use a Dell XPS 15 7590 with 64GB RAM, a 6-core i9 CPU, and a 1TB SSD. In this
experiment, both the sender and QuestDB/InfluxDB instances run on the same
machine. UDP publishing is over loopback. OS is Fedora 31, OS UDP buffer size
Database performance is the bottleneck that results in packet loss. Messages are denied entry, and the loss rate is a direct function of the underlying database speed. By sending 50M messages at different speeds, we get the following outcome.
InfluxDB's capture rate rapidly drops below 50%, eventually converging toward single-digit rates.
QuestDB's ingestion speed results are obtained through ILP. Our ingestion speed is considerably higher while using our native input formats instead.
In this test, we run the sender and the DB on the same machine, and it turns out that InfluxDB slows down our UDP sender by cannibalizing the CPU. Here is what happens to your CPUs while using InfluxDB:
When in use, InfluxDB saturates all CPU cores. As a consequence, it slows down any other program running on the same machine.
QuestDB maximizes the utilization of each CPU core and extracts as much performance from it as possible. For the example below, we compared InfluxDB's ingestion speed using 12 cores to QuestDB using one CPU core only. Despite utilizing one core instead of 12, QuestDB still outperforms InfluxDB significantly.
If spare CPU capacity arises, QuestDB will execute multiple data ingestion jobs in parallel, leveraging multiple CPUs at the same time, with one key difference: QuestDB uses work-stealing algorithms to ensure that every last bit of CPU capacity is used and no core sits idle. Let us illustrate why this is the case.
Modern network cards offer far higher throughput than a single receiver can handle. Being limited to one receiver by design, InfluxDB considerably under-utilizes the network card, and the receiver becomes the limiting factor in the pipeline.
Conversely, QuestDB can open parallel receivers (requiring one core each), fully utilizing the network card capabilities. The following illustration assumes that there would be spare CPU capacity in other cores to be filled. In such a scenario we would get QuestDB utilizing 12 cores, with each one of those being considerably faster than InfluxDB's combined 12 cores!
Besides ingestion, InfluxDB also saturates the CPU on queries. A single user's query cannibalizes the whole CPU, while other users have to wait for their turn.
By contrast, QuestDB uses each core separately, allowing multiple users to query or write concurrently without delay. The performance gap between QuestDB and InfluxDB grows significantly as the number of simultaneous users increases.
QuestDB supports ILP over UDP multicast and unicast sockets. TCP support will follow shortly. You don't need to change anything in your application. For Telegraf, you can configure the UDP sender for QuestDB's address and port.