Migrating from InfluxDB
This page describes the steps to import data from InfluxDB OSS or InfluxDB Cloud into QuestDB.
Data stored in InfluxDB needs to be exported before it can be imported into QuestDB. There are two ways of exporting data from InfluxDB:
- Run a SQL query using the InfluxDB API and get the results in JSON
- Run the `influxd inspect` command and get the results in ILP
The first approach is simpler and might suffice if you are migrating a small to moderate dataset. For larger datasets, the second option is advised.
#SQL query for importing small tables
This approach is recommended only for small to moderate datasets.
When using the InfluxDB API to run a SQL query, the results will be in JSON. Since we cannot import JSON files directly into QuestDB, we need to convert the results into CSV. There are many ways to do so; one of them is the jq JSON processor.
To run the SQL query you will need to have an API token.
Below is an example that queries a table using the SQL API endpoint and converts the results to CSV:
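A sketch of this step. The host, token, database, and table names are placeholders, and the `/api/v3/query_sql` path is the SQL endpoint of InfluxDB 3; check the API reference for your InfluxDB version. The jq program builds a CSV header from the keys of the first row object, then emits one CSV line per row:

```shell
# Fetch the query result as JSON (placeholders throughout; the endpoint
# path depends on your InfluxDB version):
#
#   curl -s "https://$INFLUX_HOST/api/v3/query_sql" \
#        -H "Authorization: Bearer $INFLUX_TOKEN" \
#        -H "Content-Type: application/json" \
#        -d '{"db": "mydb", "q": "SELECT * FROM cpu", "format": "json"}' \
#        > response.json
#
# A small sample response as a stand-in for the curl output above:
echo '[{"host":"serverA","usage":0.42},{"host":"serverB","usage":0.87}]' > response.json

# Convert the JSON array of row objects into CSV: the header row comes from
# the keys of the first object, then each object becomes one CSV line.
jq -r '(.[0] | keys_unsorted) as $k | ($k | @csv), (.[] | [.[$k[]]] | @csv)' response.json > cpu.csv
cat cpu.csv
```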
The resulting CSV can then be imported into QuestDB.
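One way to do this is QuestDB's REST `/imp` endpoint, which accepts CSV uploads on the HTTP port (9000 by default). The host and file name here are placeholders:

```shell
# `name` sets the target table; adjust host and file name to your setup.
curl -F data=@cpu.csv "http://localhost:9000/imp?name=cpu"
```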
#The inspect command and ILP for importing datasets at scale
To move data from InfluxDB into QuestDB at scale, it is best to use the `influxd inspect` command to export the data, as its `export-lp` subcommand exports all time-structured merge tree (TSM) data in a bucket as ILP messages in a single large text file. The text file can then be inserted into QuestDB. This assumes you are migrating from self-managed InfluxDB and have access to execute the `influxd inspect` command.
For InfluxDB Cloud users, the first step is to export the data from the cloud to InfluxDB OSS before following these instructions.
#Generate admin token
Make sure you have generated an admin token and set it as an environment variable.
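The influx CLI reads the token from the `INFLUX_TOKEN` environment variable, for example:

```shell
# The token value is a placeholder -- paste your generated admin token.
export INFLUX_TOKEN="<your-admin-token>"
```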
#Find out your org_id and bucket_id
You need to know the org_id and bucket_id of the bucket you will export. If you don't know them, you can run `influx org list` and `influx bucket list --org-id YOUR_ID` to find them.
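For example (`YOUR_ID` is a placeholder; these commands require a configured influx CLI and a valid token):

```shell
# List organizations, note the relevant org ID
influx org list

# List buckets in that organization, note the bucket ID to export
influx bucket list --org-id YOUR_ID
```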
#Export the bucket contents
Now you can export the bucket contents by using the `influxd inspect export-lp` command and defining a destination folder:
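A sketch of the command; the IDs and paths are placeholders, and `--engine-path` must point at your InfluxDB data directory:

```shell
influxd inspect export-lp \
  --bucket-id YOUR_BUCKET_ID \
  --engine-path ~/.influxdbv2/engine \
  --output-path /tmp/influx-export \
  --compress
```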
Please note the step above can take a while. As an example, it took almost an hour to export a 160 GB bucket on a mid-sized AWS EC2 instance.
#Connect to QuestDB
Connect to your QuestDB instance and issue a CREATE TABLE statement. This is not strictly needed, as the table will be created automatically once you start streaming data. However, this step is recommended because it allows you to fine-tune parameters such as column types or the partitioning strategy.
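A minimal sketch, assuming a hypothetical `cpu` table with one tag and one metric; adjust the column types and partition unit to your data:

```sql
CREATE TABLE cpu (
  host SYMBOL,
  usage DOUBLE,
  timestamp TIMESTAMP
) TIMESTAMP(timestamp) PARTITION BY DAY;
```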
Since the data is already in ILP format, there is no need to use the official QuestDB client libraries for ingestion.
You only need to connect via a socket to your instance and stream row by row.
Below is example Python code that streams the exported file to the instance:
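A minimal sketch. Port 9009 is QuestDB's default ILP-over-TCP port; the host and file path are assumptions to adjust to your setup:

```python
import socket

def stream_ilp_file(path, host="localhost", port=9009):
    """Stream an ILP export file to QuestDB line by line over plain TCP.

    9009 is QuestDB's default ILP-over-TCP port; host and path are
    assumptions -- adjust them to your deployment.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((host, port))
        with open(path, "rb") as ilp_file:
            for line in ilp_file:
                line = line.strip()
                if not line:
                    continue  # skip blank lines in the export file
                sock.sendall(line + b"\n")  # ILP messages end with a newline
```

You would then call it as, for example, `stream_ilp_file("bucket-export.lp")` (a hypothetical file name) against a running QuestDB instance.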
#Transform data in QuestDB
Since the InfluxDB export writes only one metric per ILP line, a series storing more than one metric produces multiple ILP lines per original row, each with one valid metric value and NULL for the remaining metrics. Therefore, we recommend transforming your data once it is in QuestDB.
For example, if you query a table with several metrics:
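A hypothetical example, assuming a `cpu` table with metric columns `usage_user` and `usage_system`:

```sql
SELECT timestamp, host, usage_user, usage_system FROM cpu LIMIT 5;
```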
Your result may be something like this:
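As a hypothetical illustration with metric columns `usage_user` and `usage_system`, only one metric is populated per row:

```
timestamp                    host      usage_user    usage_system
2023-01-01T00:00:00.000000Z  serverA   12.5          NULL
2023-01-01T00:00:00.000000Z  serverA   NULL          3.1
```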
A way to solve this is to execute a SQL query grouping data by all the dimensions and selecting the maximum values for all the metrics:
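A sketch with hypothetical table and column names; grouping by every dimension and taking the maximum of each metric collapses the sparse rows into one:

```sql
SELECT timestamp, host,
       max(usage_user) AS usage_user,
       max(usage_system) AS usage_system
FROM cpu
GROUP BY timestamp, host;
```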
This produces aggregated rows containing all the metrics for each dimension group:
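With the hypothetical data above being two sparse rows for the same timestamp and host, the aggregation yields one complete row per group, for example:

```
timestamp                    host      usage_user    usage_system
2023-01-01T00:00:00.000000Z  serverA   12.5          3.1
```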
You can use the INSERT keyword to output the processed result into a new table.
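For example, with the same hypothetical column names, writing the result into a new `cpu_clean` table:

```sql
INSERT INTO cpu_clean
SELECT timestamp, host,
       max(usage_user) AS usage_user,
       max(usage_system) AS usage_system
FROM cpu
GROUP BY timestamp, host;
```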