Ingestion Primer
QuestDB makes high-performance ingestion ("data-in") easy.
Choose from existing third-party tools, first-party protocols, or both.
Whether you're working with:
- financial market data
- sensors
- analytics
- event-driven apps

and applying third-party tools, first-party protocols, or both, this guide will prepare you to get the most out of (and into!) QuestDB.
## Protocol ingest methods

The core ingestion methods are based on popular protocols. You may interact with them directly, or through third-party tools.
| Network Endpoint | Default Port | Inserting & modifying data |
| --- | --- | --- |
| InfluxDB Line Protocol | 9000 | High-performance streaming (recommended!) |
| PostgreSQL Wire Protocol | 8812 | SQL `INSERT`, `UPDATE` |
| HTTP REST API | 9000 | SQL `INSERT`, `UPDATE`, CSV |
| Web Console | 9000 | SQL `INSERT`, `UPDATE`, CSV |
All protocols benefit from the core features of QuestDB. Deduplication, out-of-order ingestion, and strong performance with wide tables and high-cardinality data are always present.
And no matter which method you choose, querying (data-out) is handled via extended SQL.
If you're unsure which method is right for you, consider joining the Slack community to speak to other developers.
We'll introduce the protocols one-by-one, and link out to deeper reference materials.
## InfluxDB Line Protocol (ILP)

Recommended!
InfluxDB Line Protocol (ILP) is the recommended ingestion method for high-performance applications. ILP is an insert-only protocol that bypasses SQL `INSERT` statements, thus achieving significantly higher throughput. It is the fastest way to insert data, and it excels with high-volume data streaming. The QuestDB clients use ILP by default, and many of the third-party tools and integrations use it too.
An example of "data-in" via ILP appears as such:
As the example shows, ILP is a text protocol over HTTP (or TCP) which offers ease-of-use. No upfront schema is required; tables are created automatically if they do not already exist. The protocol thrives in situations where multiple streams ingest into a single source. It also supports on-the-fly, concurrent schema changes. For health management, there is error handling and a health check endpoint.
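To make the wire format concrete, here is a minimal, illustrative serializer in Python. This is a sketch only, not the official client; the QuestDB client libraries additionally handle escaping, buffering, and flushing for you:

```python
# Minimal, illustrative ILP serializer: one row -> one protocol line.
def to_ilp(table, symbols, columns, ts_nanos):
    """Render one row as an InfluxDB Line Protocol line."""
    # Symbols (tags) are unquoted key=value pairs after the table name.
    tags = ",".join(f"{k}={v}" for k, v in symbols.items())
    # String columns are double-quoted; numeric columns are written as-is.
    fields = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in columns.items()
    )
    return f"{table},{tags} {fields} {ts_nanos}"

line = to_ilp(
    "trades",
    symbols={"symbol": "ETH-USD", "side": "sell"},
    columns={"price": 2615.54, "amount": 0.00044},
    ts_nanos=1646762637609765000,
)
# line == 'trades,symbol=ETH-USD,side=sell price=2615.54,amount=0.00044 1646762637609765000'
```

In practice, prefer the official clients, which batch many such lines and send them over HTTP or TCP.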
If you'd like to apply the InfluxDB Line Protocol, see:
- InfluxDB Line Protocol Reference for examples and details on the message format, ports and authentication.
- (Recommended!) QuestDB client libraries for user-friendly ILP clients in a growing number of languages.
## PostgreSQL Wire Protocol (PGWire)

PostgreSQL Wire Protocol (PGWire) provides ingestion interoperability with the PostgreSQL ecosystem. QuestDB supports most PostgreSQL keywords and functions, including parameterized queries and `psql` on the command line. Altogether, PGWire support for ingestion, together with PostgreSQL-style querying for data-out, allows QuestDB to connect with a wide range of third-party PostgreSQL client libraries and tools.
By default, PGWire runs over TCP on port 8812 and supports SQL `INSERT` and `COPY` statements. In contrast to streaming cases, for which we recommend the InfluxDB Line Protocol, PGWire is better suited for applications that `INSERT` via SQL programmatically. PGWire also provides parameterized queries, which help avoid tricky SQL injection issues.
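As a sketch, ingestion over PGWire is plain SQL; the connection command, table, and values below are assumptions for a local default instance:

```sql
-- Connect with e.g.: psql -h localhost -p 8812 -U admin
CREATE TABLE IF NOT EXISTS trades (
  ts TIMESTAMP,
  symbol SYMBOL,
  price DOUBLE
) TIMESTAMP(ts) PARTITION BY DAY;

INSERT INTO trades VALUES (now(), 'ETH-USD', 2615.54);
```

From application code, use your language's PostgreSQL driver with parameterized statements rather than string concatenation.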
If PostgreSQL is the answer for your team, see:
- PostgreSQL Wire Protocol reference for examples, an overview and references to deeper techniques.
## REST HTTP API

The HTTP REST API provides endpoints for importing data, exporting data, and querying. It is compatible with a wide range of libraries and tools, and it powers the QuestDB Web Console.

To continue with the REST HTTP API, check out:
- REST HTTP API provides an overview, examples and more.
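As a small sketch, SQL can be sent to the REST API's `/exec` endpoint as a URL-encoded query parameter; the host, port, and query below are assumptions for a local instance:

```python
from urllib.parse import urlencode

# Build a request URL for the /exec endpoint, which runs SQL and returns JSON.
base = "http://localhost:9000/exec"  # assumed local QuestDB instance
url = f"{base}?{urlencode({'query': 'SELECT * FROM trades LIMIT 5;'})}"

# Fetch it with any HTTP client, e.g.:
#   import urllib.request
#   print(urllib.request.urlopen(url).read())
```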
## Easy CSV upload

For GUI-driven CSV upload, which leverages the REST HTTP API, use the Import tab in the Web Console.
For all CSV import methods, including using the APIs directly, see the CSV Import Guide.
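As a sketch, a CSV payload only needs a header row of column names followed by data rows; the file name and columns below are illustrative:

```python
import csv
import io

# Build a small CSV in memory; the header row becomes the column names.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["ts", "symbol", "price"])
writer.writerow(["2024-01-01T00:00:00.000000Z", "ETH-USD", 2615.54])
csv_text = buf.getvalue()

# After saving it to a file, upload via the REST API (assumes a local
# instance), e.g.:
#   curl -F data=@trades.csv "http://localhost:9000/imp"
```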
## QuestDB and toolchains

QuestDB is an essential part of a high-performance data architecture. As such, it provides interoperability with other tools and services. Depending on your needs, QuestDB may help process, ingest, organize, accelerate or store your data. The use cases are many!
For ingestion specifically, it is common for QuestDB to sit on the receiving end of a service such as Apache Kafka. Kafka is a fault-tolerant message broker that excels at streaming. Given Kafka's popularity, its ecosystem tooling also works with compatible services such as Redpanda, much as QuestDB itself supports the InfluxDB Line Protocol. There are several ways to connect Kafka to QuestDB:
- Apply the Kafka Connect based QuestDB Kafka connector
- Apply our Kafka Connect based generic JDBC Connector
- Write a custom program to read data from Apache Kafka and write to QuestDB
- Use a stream processing engine
Each strategy has different trade-offs.
The rest of this section discusses each strategy and guides users who are already familiar with the Kafka ecosystem.
## Apache Kafka

### QuestDB connector

Recommended for most people!
QuestDB develops a first-party QuestDB Kafka connector. The connector is built on top of the Kafka Connect framework and uses the InfluxDB Line Protocol for communication with QuestDB. Kafka Connect handles concerns such as fault tolerance and serialization. It also provides facilities for message transformations, filtering and so on.
The underlying InfluxDB Line Protocol ensures operational simplicity and excellent performance: it can comfortably insert hundreds of thousands of rows per second. Building on Kafka Connect also allows QuestDB to connect with Kafka-compatible applications like Redpanda.
Read our QuestDB Kafka connector guide to get started, with either self-hosted or QuestDB Cloud instances.
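As a sketch, a minimal Kafka Connect properties file for the connector might look as follows. The name, topic, table, and address are illustrative, and the exact keys can differ by connector version, so treat the guide as authoritative:

```
name=questdb-trades-sink
connector.class=io.questdb.kafka.QuestDBSinkConnector
topics=trades
table=trades
client.conf.string=http::addr=localhost:9000;
```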
### JDBC connector

Similar to the QuestDB Kafka connector, the JDBC connector also uses the Kafka Connect framework. However, instead of a dedicated InfluxDB Line Protocol stream, it relies on a generic JDBC driver and QuestDB's PGWire compatibility. Just as the QuestDB connector works with Kafka-compatible utilities like Redpanda, the JDBC connector works with utilities such as Apache Spark and other JDBC-capable tools.

The JDBC connector requires objects in Kafka to have an associated schema, and overall it is more complex to set up and run. Compared to the QuestDB Kafka connector, it has significantly lower performance, but it offers the following advantages:
- Higher consistency guarantees than the fire-and-forget QuestDB Kafka connector
- Many Kafka-as-a-Service providers have the JDBC connector pre-packaged

Recommended if the QuestDB Kafka connector cannot be used.
### Customized program

Writing a dedicated program that reads from Kafka topics and writes to QuestDB tables offers great flexibility: the program can do arbitrary data transformations and filtering, including stateful operations.
On the other hand, it's the most complex strategy to implement. You'll have to deal with different serialization formats, handle failures, etc. This strategy is recommended for very advanced use cases only.
Not recommended for most people.
### Stream processing

Stream processing engines provide a middle ground between writing a dedicated program and using one of the connectors. Engines such as Apache Flink provide rich APIs for data transformations, enrichment, and filtering; at the same time, they help with shared concerns such as fault tolerance and serialization. However, they often have a non-trivial learning curve.
QuestDB offers a connector for Apache Flink. It is the recommended strategy if you are an existing Flink user, and you need to do complex transformations while inserting entries from Kafka into QuestDB.
## Other third-party tools

For a full list of third-party tools supported by QuestDB, see third party tools.
## Next step - queries

Depending on your infrastructure, it should now be apparent which ingestion method is worth pursuing.
Of course, ingestion (data-in) is only half the battle.
Your next best step? Learn how to query and explore data-out from the Query & SQL Overview.
It might also be a solid bet to review timestamp basics.
We also assumed that you have data already.
No data yet? Just starting? No worries. We've got you covered!
There are several quick scaffolding options:
- QuestDB demo instance: Hosted, fully loaded and ready to go. Quickly explore the Web Console and SQL syntax.
- Create my first data set guide: Create tables, use `rnd_` functions and make your own data.
- Sample dataset repos: IoT, e-commerce, finance or git logs? Check them out!
- Quick start repos: Code-based quick starts that cover ingestion, querying and data visualization using common programming languages and use cases. Also, a cat in a tracksuit.
- Time series streaming analytics template: A handy template for near real-time analytics using open source technologies.
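As a sketch of the `rnd_` functions mentioned above, a statement like the following can generate a small sample table; the table, column names, and values are illustrative:

```sql
CREATE TABLE sensors AS (
  SELECT
    rnd_symbol('A', 'B', 'C') AS device,
    rnd_double() AS reading,
    timestamp_sequence(0, 100000L) AS ts
  FROM long_sequence(10)
) TIMESTAMP(ts);
```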
There are also one-click sample data sets available in QuestDB Cloud.