# Migrating from InfluxDB
This page describes the steps to import data from InfluxDB OSS or InfluxDB Cloud into QuestDB.
## Overview

Data stored in InfluxDB needs to be exported before it can be imported into QuestDB. There are two ways of exporting data from InfluxDB:
- Run a SQL query using the InfluxDB API and get results in JSON
- Run the `influxd inspect` command and get results in InfluxDB Line Protocol
The first approach is simpler and might suffice if you are migrating a small to moderate dataset. For larger datasets, it is advised to use the second option.
## SQL query for importing small tables

> **Note:** This approach is recommended only for small to moderate datasets.
When using the InfluxDB API to run a SQL query, results are returned in JSON. Since QuestDB cannot import JSON files directly, we need to convert the results into CSV. There are many ways to do so; one of them is using the `jq` JSON processor.
To run the SQL query, you will need an API token.
The example below queries a table using the SQL API endpoint and converts the results to CSV:
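A minimal sketch, assuming the 1.x-compatible `/query` endpoint of a local InfluxDB instance (which accepts InfluxQL, InfluxDB's SQL-like language), with hypothetical database (`mydb`) and measurement (`cpu`) names:

```shell
# Hypothetical host, database, and measurement names.
# The /query endpoint returns results as JSON; jq flattens the
# columns array into a CSV header and each values array into a CSV row.
curl -sG "http://localhost:8086/query" \
  --header "Authorization: Token $INFLUX_TOKEN" \
  --data-urlencode "db=mydb" \
  --data-urlencode "q=SELECT * FROM cpu" |
  jq -r '.results[0].series[0] | (.columns | @csv), (.values[] | @csv)' > cpu.csv
```

The `jq` filter emits the column names as a CSV header line followed by one CSV row per value array.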
The resulting CSV can then be imported into QuestDB.
## The inspect command and InfluxDB Line Protocol for importing datasets at scale

To move data from InfluxDB into QuestDB at scale, it is best to use the `influxd inspect` command to export the data, as the `export-lp` subcommand allows exporting all time-structured merge tree (TSM) data in a bucket as InfluxDB Line Protocol messages in a single large text file. That text file can then be inserted into QuestDB. This assumes you are migrating from self-managed InfluxDB and have access to execute the `influxd inspect` command.
For InfluxDB Cloud users, the first step should be exporting the data from InfluxDB Cloud to InfluxDB OSS before following the instructions below.
## Instructions

### Generate an admin token

Make sure you have an admin token generated, and export it as the `INFLUX_TOKEN` environment variable. For example:
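A minimal sketch (the token value is a placeholder):

```shell
# Replace the placeholder with your actual admin token
export INFLUX_TOKEN="your-admin-token-here"
```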
### Find your org_id and bucket_id

You need to know the org_id and bucket_id of the bucket you will export. If you don't know them, you can issue `influx org list` and `influx bucket list --org-id YOUR_ID` to find those values.
### Export the bucket contents

Now you can export the bucket contents by using the `influxd inspect export-lp` command and defining a destination folder:
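A sketch of the export invocation, assuming the default engine path and a hypothetical output directory (the optional `--compress` flag gzips the output):

```shell
influxd inspect export-lp \
  --bucket-id YOUR_BUCKET_ID \
  --engine-path ~/.influxdbv2/engine \
  --output-path /tmp/influx-export \
  --compress
```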
Note that the step above can take a while. As an example, it took almost an hour to export a 160 GB bucket on a mid-sized AWS EC2 instance.
### Connect to QuestDB

Connect to your QuestDB instance and issue a CREATE TABLE statement. This is not strictly necessary, as the table will be created automatically once you start streaming data. However, this step is recommended because it allows fine-tuning parameters such as column types or the partitioning strategy.
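For instance, a sketch assuming a hypothetical `cpu` table with a `hostname` tag and two metrics (all names are assumptions; adjust the types and partitioning unit to your data):

```sql
CREATE TABLE cpu (
  timestamp TIMESTAMP,
  hostname SYMBOL,
  usage_user DOUBLE,
  usage_system DOUBLE
) TIMESTAMP(timestamp) PARTITION BY DAY;
```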
Since the data is already in InfluxDB Line Protocol format, there is no need to use the official QuestDB client libraries for ingestion. You only need to connect to your instance via a socket and stream the file row by row. Below is an example Python script that streams the exported file to the instance:
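A minimal Python sketch, assuming QuestDB is listening for InfluxDB Line Protocol over TCP on the default port 9009, and that the export file path matches the destination folder chosen earlier (both are assumptions):

```python
import socket

def stream_file(path: str, host: str = "localhost", port: int = 9009) -> None:
    """Stream an InfluxDB Line Protocol file to QuestDB, one line at a time."""
    with socket.create_connection((host, port)) as sock:
        with open(path, "rb") as f:
            for line in f:
                # Skip blank lines and any comment lines the export may contain
                if not line.strip() or line.lstrip().startswith(b"#"):
                    continue
                sock.sendall(line)

# Example usage (hypothetical path from the export step):
# stream_file("/tmp/influx-export/export.lp")
```

For very large files you may want to add retry logic or re-open the connection on failure; this sketch keeps the logic minimal.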
### Transform data in QuestDB

Since InfluxDB exports only one metric per line, storing more than one metric for the same series means each original row produces multiple InfluxDB Line Protocol lines, each with one valid metric value and the other metrics shown as `NULL`. Therefore, we recommend transforming your data in QuestDB.
For example, if you query a table with several metrics:
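As a sketch, assuming a hypothetical `cpu` table (the name is an assumption):

```sql
SELECT * FROM cpu;
```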
Your result may be something like this:
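Assuming a hypothetical `cpu` table with metrics `usage_user` and `usage_system`, the sparse output might look like:

```
timestamp                    hostname  usage_user  usage_system
2023-01-01T00:00:00.000000Z  host1     1.5         NULL
2023-01-01T00:00:00.000000Z  host1     NULL        2.3
```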
A way to solve this is to execute a SQL query grouping data by all the dimensions and selecting the maximum values for all the metrics:
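A sketch of such a query, again assuming hypothetical `cpu` columns `timestamp`, `hostname`, `usage_user`, and `usage_system`:

```sql
SELECT timestamp, hostname,
       max(usage_user) AS usage_user,
       max(usage_system) AS usage_system
FROM cpu
GROUP BY timestamp, hostname;
```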
This produces aggregated rows containing all the metrics for each dimension group:
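Under the same hypothetical schema, the grouped result collapses the sparse rows into one row per dimension group:

```
timestamp                    hostname  usage_user  usage_system
2023-01-01T00:00:00.000000Z  host1     1.5         2.3
```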
You can use the INSERT keyword to output the processed result into a new table.
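For instance, assuming a pre-created hypothetical `cpu_clean` table with the same schema:

```sql
INSERT INTO cpu_clean
SELECT timestamp, hostname,
       max(usage_user) AS usage_user,
       max(usage_system) AS usage_system
FROM cpu
GROUP BY timestamp, hostname;
```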