Ingestion from Kafka Overview

Kafka is a fault-tolerant message broker that excels at streaming. Because of Kafka's popularity, its ecosystem and tooling also work with Kafka-compatible services such as Redpanda, much as QuestDB supports the InfluxDB Line Protocol. There are several strategies for ingesting data from Kafka into QuestDB:

  1. Use the Kafka Connect based QuestDB Kafka connector
  2. Use our Kafka Connect based generic JDBC connector
  3. Write a custom program that reads from Apache Kafka and writes to QuestDB
  4. Use a stream processing engine

Each strategy has different trade-offs.

The rest of this section discusses each strategy in turn; it assumes you are already familiar with the Kafka ecosystem.

Apache Kafka

QuestDB connector

Recommended for most people!

QuestDB develops a first-party QuestDB Kafka connector. The connector is built on top of the Kafka Connect framework and uses the InfluxDB Line Protocol to communicate with QuestDB. Kafka Connect handles concerns such as fault tolerance and serialization, and also provides facilities for message transformations, filtering, and so on.

The underlying InfluxDB Line Protocol ensures operational simplicity and excellent performance: the connector can comfortably insert hundreds of thousands of rows per second. Building on Kafka Connect also lets QuestDB connect with Kafka-compatible applications such as Redpanda.

Read our QuestDB Kafka connector guide to get started with either self-hosted or QuestDB Cloud instances.
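For illustration, a minimal standalone-mode configuration for the connector could look like the sketch below. The connector class and property names follow the connector guide; the host, topic, and table values are placeholders to adjust for your setup:

```properties
name=questdb-sink
connector.class=io.questdb.kafka.QuestDBSinkConnector
# Placeholder: QuestDB host and InfluxDB Line Protocol port
host=localhost:9009
# Placeholder: Kafka topic to consume and QuestDB table to write to
topics=example-topic
table=example_table
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
```

With Kafka Connect installed, a file like this can be loaded in standalone mode, e.g. `connect-standalone.sh worker.properties questdb-sink.properties`.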

JDBC connector

Similar to the QuestDB Kafka connector, the JDBC connector also uses the Kafka Connect framework. However, instead of a dedicated InfluxDB Line Protocol stream, it relies on a generic JDBC driver and QuestDB's PGWire compatibility. Just as the QuestDB Kafka connector can be used with Kafka-compatible utilities like Redpanda, the JDBC connector also works with tools such as Apache Spark.

The JDBC connector requires objects in Kafka to have an associated schema, and overall it is more complex to set up and run. Compared to the QuestDB Kafka connector, the JDBC connector has significantly lower performance, but it offers the following advantages:

  • Higher consistency guarantees than the fire-and-forget QuestDB Kafka connector
  • Many Kafka-as-a-Service providers have the JDBC connector pre-packaged

Recommended if the QuestDB Kafka connector cannot be used.
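As a sketch, a JDBC sink configuration pointed at QuestDB might look as follows, assuming Confluent's JDBC sink connector and QuestDB's default PGWire port and credentials; the topic name and converter settings are placeholders:

```properties
name=questdb-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
# QuestDB exposes the PostgreSQL wire protocol on port 8812 by default
connection.url=jdbc:postgresql://localhost:8812/qdb
connection.user=admin
connection.password=quest
topics=example-topic
insert.mode=insert
# JDBC sink records must carry a schema, e.g. Avro plus a schema registry
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
```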

Custom program

Writing a dedicated program reading from Kafka topics and writing to QuestDB tables offers great flexibility. The program can do arbitrary data transformations and filtering, including stateful operations.

On the other hand, it is the most complex strategy to implement: you will have to handle different serialization formats, deal with failures, and so on. This strategy is recommended for very advanced use cases only.

Not recommended for most people.
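If you do go down this route, the sketch below shows the general shape, assuming the QuestDB Java ILP client (io.questdb:questdb) alongside the standard Kafka consumer; the topic, table, and column names are hypothetical, and the `Sender` configuration string should be adjusted for your instance:

```java
import io.questdb.client.Sender;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class KafkaToQuestDB {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "questdb-ingest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             // ILP over HTTP; adjust the address for your QuestDB instance
             Sender sender = Sender.fromConfig("http::addr=localhost:9000;")) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            while (true) {
                for (ConsumerRecord<String, String> rec :
                        consumer.poll(Duration.ofMillis(500))) {
                    // Hypothetical mapping: store the raw value as a string;
                    // real code would parse JSON/Avro and map fields to columns
                    sender.table("example_table")
                            .symbol("kafka_key", rec.key() == null ? "none" : rec.key())
                            .stringColumn("payload", rec.value())
                            .atNow();
                }
                sender.flush();
            }
        }
    }
}
```

A production version would additionally need deserialization matching your topic's format, error handling, and explicit offset management.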

Stream processing

Stream processing engines provide a middle ground between writing a dedicated program and using one of the connectors. Engines such as Apache Flink provide rich APIs for data transformation, enrichment, and filtering; at the same time, they take care of shared concerns such as fault tolerance and serialization. However, they often have a non-trivial learning curve.

QuestDB offers a connector for Apache Flink. It is the recommended strategy if you are an existing Flink user and need to do complex transformations while inserting entries from Kafka into QuestDB.
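For illustration, once the connector is on the Flink classpath, a sink table can be declared in Flink SQL as sketched below; the table schema and the `kafka_trades` source are hypothetical, and the connector options should be checked against the connector's current documentation:

```sql
-- Declare a Flink sink table backed by the QuestDB connector
CREATE TABLE trades (
    symbol STRING,
    price  DOUBLE,
    amount DOUBLE
) WITH (
    'connector' = 'questdb',
    'host' = 'localhost'
);

-- Insert from any Flink source, e.g. a Kafka-backed table
INSERT INTO trades
SELECT symbol, price, amount FROM kafka_trades;
```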


โญ Something missing? Page not helpful? Please suggest an edit on GitHub.