Good data from the past helps us make better decisions in the present. Most of today's data were created within the past ten years, and human data output is set to keep growing exponentially. This sudden pervasiveness of data means that we need new ways to store and process information that focus on efficiency and sustainability. This article describes why speed and performance in a time-series database are the key to staying afloat in a sea of data.
How does QuestDB get the kind of performance it does, and how are we continuing to squeeze another 50-60% out of it? This post looks at a code change that we expected to hurt performance but that actually brought a substantial boost to the system's overall performance, a reminder that we are constantly learning more about performance improvements.
A few weeks ago, I posted on Hacker News the story of how I started QuestDB. Several people found the story interesting, so I thought I would post it here too. It covers working at a large energy trading company, discovering memory-mapping approaches in Java, the beginnings of building the system as a side project, and how we got to where we are today, with companies relying on production instances of our time-series database.
We've been upping our SWAG game a lot lately, and we want to share it with you, the valuable members of our community! We want to give you the chance to show off your projects, show off your love for QuestDB, and to just show off! Whether it's large or small projects, follow the steps in this post so we can send some swag your way!
If you listen to, well, pretty much anyone rational, they will tell you in no uncertain terms that the last thing you ever want to do is put your SQL database on the public internet. Even if you're crazy enough to do that, you certainly should never post its address somewhere like Hacker News. We did it anyway, and this post describes why we did it, what we learned, and what people tried to do with it.
SIMD (single instruction, multiple data) instructions are CPU instructions that apply the same arithmetic operation to several data points simultaneously, a form of data-level parallelism. This post describes how SIMD works, how typical operations perform with it, and the additional optimizations we managed to achieve.
Inter-thread messaging is a fundamental part of any asynchronous system. It is the component responsible for transporting data between threads. Messaging is the infrastructure that scaffolds multi-threaded applications, and just like real-world transport infrastructure, we want it to be inexpensive, fast, reliable, and clean. For QuestDB, we wrote our own messaging system, and this post is about how it works and how fast it is.
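
To make the idea of moving data between threads concrete, here is a minimal sketch of a bounded queue backed by a fixed-size ring of slots. It is an illustration only and makes no claim about how QuestDB's messaging system is implemented: the names `RingBufferQueue` and `transfer` are hypothetical, and the sketch uses a lock and condition variable for simplicity, which a performance-focused design would likely try to avoid.

```python
import threading


class RingBufferQueue:
    """Bounded FIFO queue over a fixed-size ring of slots (hypothetical name)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0  # sequence number of the next item to read
        self._tail = 0  # sequence number of the next slot to write
        self._cond = threading.Condition()

    def put(self, item):
        """Block while the ring is full, then store the item."""
        with self._cond:
            while self._tail - self._head == len(self._slots):
                self._cond.wait()
            self._slots[self._tail % len(self._slots)] = item
            self._tail += 1
            self._cond.notify_all()

    def get(self):
        """Block while the ring is empty, then return the oldest item."""
        with self._cond:
            while self._head == self._tail:
                self._cond.wait()
            index = self._head % len(self._slots)
            item = self._slots[index]
            self._slots[index] = None  # release the reference held by the slot
            self._head += 1
            self._cond.notify_all()
            return item


def transfer(items, capacity=4):
    """Send items from a producer thread to a consumer thread through the ring."""
    queue = RingBufferQueue(capacity)
    received = []
    done = object()  # sentinel that tells the consumer to stop

    def produce():
        for item in items:
            queue.put(item)
        queue.put(done)

    def consume():
        while True:
            item = queue.get()
            if item is done:
                break
            received.append(item)

    threads = [threading.Thread(target=produce), threading.Thread(target=consume)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

The small capacity forces the producer to wait whenever the consumer falls behind, which is the back-pressure behaviour a bounded transport gives you for free.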
InfluxDB is the current market leader in time series. This post examines their ingestion format called InfluxDB line protocol (ILP) and compares data ingestion performance between QuestDB and InfluxDB. We'll look at data loss over UDP and some of the reasons why QuestDB is more efficient at ingesting records in ILP.
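
For readers who have not seen ILP before, the sketch below builds a single line in the general shape `measurement,tag=value field=value timestamp` and optionally sends it over UDP. The helper names (`to_ilp_line`, `send_over_udp`) and the sample measurement are hypothetical, the escaping covers only the common cases, and the host and port must be supplied by the caller; treat it as an approximation of the format rather than a reference encoder.

```python
import socket


def _escape(text, chars):
    """Backslash-escape every character of `chars` found in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render a field value: booleans, integers (i suffix), floats, quoted strings."""
    if isinstance(value, bool):  # must be checked before int, since bool is an int
        return "t" if value else "f"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ilp_line(measurement, tags, fields, timestamp_ns=None):
    """Build one newline-terminated ILP record (hypothetical helper)."""
    line = _escape(measurement, ", ")
    for key, value in tags.items():
        line += "," + _escape(key, ",= ") + "=" + _escape(str(value), ",= ")
    line += " " + ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line + "\n"


def send_over_udp(line, host, port):
    """Fire-and-forget send: UDP gives no delivery guarantee, so records can be lost."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

Because `send_over_udp` never waits for an acknowledgement, the sender cannot tell whether a record arrived, which is why data loss over UDP is worth measuring.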