The REST API provides an
/imp endpoint exposed on port
9000 by default. This
endpoint streams tabular text data directly into a table, supporting CSV, TAB
and pipe (
|) delimited inputs with optional headers. Data types and structures
are detected automatically, but additional configuration can be provided to
improve automatic detection.
QuestDB uses a
text_loader.json configuration file which can be placed in the
conf directory. This file does not exist by default, but has the
following implicit settings:
Given a CSV file which contains timestamps in the format
yyyy-MM-dd - HH:mm:ss.SSSUUU, the following text loader configuration will
provide the correct timestamp parsing:
The CSV data can then be loaded via POST request, for example, using cURL:
For more information on the
/imp entry point, refer to the
REST API documentation.
batch size parameters during
INSERT AS SELECT
statements is a convenient strategy to load and order large datasets from CSV in
bulk when they contain out-of-order data.
The batch size specifies how many records to attempt to bulk insert at one time and the lag allows for specifying the expected lateness of out-of-order timestamp values (in microseconds):
Given a large dataset
weather-unordered.csv with out-of-order records, an
ordered table may be created using the following steps:
Import the unordered dataset via the Web Console. The REST API may also be used for the same operation:Importing unordered CSV data via curlcurl -F email@example.com 'http://localhost:9000/imp'
Create a table with the schema of the imported data and apply a partitioning strategy. Records are not yet inserted due to the use of
WHERE 1 != 1. The
timestampcolumn may be cast as a
timestampif the import did not automatically detect the correct format:CREATE TABLE weather AS (SELECTcast(timestamp AS timestamp) timestamp,windDir,windSpeed,windGust,rain1H,rain6H,rain24HFROM 'weather-unordered.csv' WHERE 1 != 1) timestamp(timestamp) PARTITION BY DAY;
Insert the unordered records into the partitioned table and provide a
batchsize:INSERT batch 100000 commitLag 180s INTO weatherSELECTcast(timestamp AS timestamp) timestamp,windDir,windSpeed,windGust,rain1H,rain6H,rain24HFROM 'weather-unordered.csv';
To confirm that the table is ordered, the
isOrdered() function may be used:
More information about the use of
isOrdered() can be found on the
boolean functions documentation.