CREATE TABLE reference

To create a new table in the database, the CREATE TABLE keywords followed by column definitions are used.

Syntax#

To create a table by manually entering parameters and settings:

Flow chart showing the syntax of the CREATE TABLE keyword

note

Checking table metadata can be done via the tables() and table_columns() functions which are described in the meta functions documentation page.

To create a table by cloning the metadata of an existing table:

Flow chart showing the syntax of the CREATE TABLE LIKE keyword

Write-Ahead Log (WAL) Settings#

By default, created tables are Write-Ahead Log enabled

While we recommend WAL-enabled tables, it is still possible to create non-WAL-enabled tables.

CREATE TABLE's global configuration setting allows you to alter the default behaviour via cairo.wal.enabled.default:

  • true: Creates a WAL table (default)
  • false: Creates a non-WAL table

And on an individual basis, you can also use BYPASS WAL.

The BYPASS WAL keyword indicates that the Write-Ahead Log should be bypassed. This means that changes to the table will not be logged in the WAL, which can improve performance in certain scenarios where the additional durability guarantees provided by the WAL are not required. However, bypassing the WAL means that if the database crashes or is not shut down cleanly, any recent changes to the table might be lost. And that you cannot utilize WAL-requiring features like deduplication.

To use BYPASS WAL during table creation, include it in the CREATE TABLE statement after the column definitions and before any other table parameters such as the designated timestamp or partitioning strategy, like so:

CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP) BYPASS WAL TIMESTAMP(ts) PARTITION BY DAY;

In the above statement, BYPASS WAL creates the table without using the WAL.

The table will have a designated timestamp column ts and will be partitioned by day.

-- Basic example without a designated timestamp and no partitioning, bypassing WAL
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING) BYPASS WAL;
-- Adding a designated timestamp and bypassing WAL
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING) BYPASS WAL TIMESTAMP(ts);
-- Adding a partitioning strategy by DAY and bypassing WAL
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING) BYPASS WAL TIMESTAMP(ts) PARTITION BY DAY;
-- Adding parameters for symbol type with BYPASS WAL
CREATE TABLE my_table(symb SYMBOL CAPACITY 256 NOCACHE INDEX CAPACITY 1048576, price DOUBLE, ts TIMESTAMP, s STRING) BYPASS WAL TIMESTAMP(ts) PARTITION BY DAY;
-- Create table as select with data manipulation and bypassing WAL
CREATE TABLE taxi_trips AS(
SELECT * FROM taxi_trips_unordered ORDER BY pickup_time
) BYPASS WAL TIMESTAMP(pickup_time) PARTITION BY MONTH;

We recommend using WAL-tables. Keep in mind that using BYPASS WAL is a trade-off between write performance and data safety, and goes against the grain. It should only be used when you're certain that the potential for data loss and the absence of features is acceptable in your use case.

IF NOT EXISTS#

An optional IF NOT EXISTS clause may be added directly after the CREATE TABLE keywords to indicate that a new table should be created if one with the desired table name does not already exist.

CREATE TABLE IF NOT EXISTS test_table(price DOUBLE, ts TIMESTAMP) timestamp(ts);

Table name#

Internally the table name is used as a directory name on the file system. It can contain both ASCII and Unicode characters. The table name must be unique and an error is returned if a table already exists with the requested name.

In addition, table names are case insensitive: example, exAmPlE, EXAMplE and EXAMPLE are all treated the same. Table names containing spaces or period . character must be enclosed in double quotes, for example:

CREATE TABLE "example out of.space" (a INT);
INSERT INTO "example out of.space" values (1);

Column name#

As with table names, the column name is used for file names internally. Although it does support both ASCII and Unicode characters, character restrictions specific to the file system still apply.

Tables may have up to 2,147,483,647 columns. Column names are also case insensitive. For example: example, exAmPlE, EXAMplE and EXAMPLE are all treated the same. However, column names must be unique within each table and must not contain a period . character.

Type definition#

When specifying a column, a name and type definition must be provided. The symbol type may have additional optional parameters applied.

Flow chart showing the syntax of the different column types

Symbols#

Optional keywords and parameters may follow the symbol type which allow for further optimization on the handling of this type. For more information on the benefits of using this type, see the symbol overview.

Symbol capacity#

CAPACITY is an optional keyword used when defining a symbol type on table creation to indicate how many distinct values this column is expected to have. When distinctValueEstimate is not explicitly specified, a default value of cairo.default.symbol.capacity is used.

distinctValueEstimate - the value used to size data structures for symbols.

CREATE TABLE my_table(symb SYMBOL CAPACITY 128, price DOUBLE, ts TIMESTAMP),
INDEX (symb) timestamp(ts);

The symbol capacity is not to be confused with index capacity described in column indexes below.

CREATE TABLE my_table
(symb SYMBOL capacity 128 NOCACHE INDEX capacity 256, price DOUBLE, ts TIMESTAMP)
timestamp(ts);

Symbol caching#

CACHE | NOCACHE is used to specify whether a symbol should be cached. The default value is CACHE unless otherwise specified.

CREATE TABLE my_table
(symb SYMBOL CAPACITY 128 NOCACHE, price DOUBLE, ts TIMESTAMP)
timestamp(ts);

Casting types#

castDef - casts the type of a specific column. columnRef must reference existing column in the selectSql

Flow chart showing the syntax of the cast function

CREATE TABLE test AS (SELECT CAST(x as DOUBLE) x FROM long_sequence(10));

IPv4 addresses#

Create an IPv4-friendly table using the following pattern:

CREATE TABLE traffic (ts timestamp, src ipv4, dst ipv4) timestamp(ts) PARTITION BY DAY;

This demonstrates:

  • creation of a table called traffic
  • columns for timestamp, source ipv4 address, destination ipv4 address
  • a partition based on a day

Column indexes#

Index definitions (indexDef) are used to create an index for a table column. The referenced table column must be of type symbol.

Flow chart showing the syntax of the index function

CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP),
INDEX (symb) TIMESTAMP(ts);

An index capacity may be provided for the index by defining the index storage parameter, valueBlockSize:

CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP),
INDEX (symb CAPACITY 128) TIMESTAMP(ts);
-- equivalent to
CREATE TABLE my_table(symb SYMBOL INDEX CAPACITY 128, price DOUBLE, ts TIMESTAMP),
TIMESTAMP(ts);

See Index for more information about index capacity.

CREATE TABLE AS#

When SQL (selectSQL) is SELECT * FROM tab or any arbitrary SQL result, the selected column names and their data type will be cloned to the new table.

Create table as select
CREATE TABLE new_table AS(
SELECT
rnd_int() a,
rnd_double() b,
rnd_symbol('ABB', 'CDD') c
FROM
long_sequence(100)
WHERE false
);

The data type of a column can be changed:

Clone an existing wide table and change type of cherry-picked columns
CREATE TABLE new_table AS (SELECT * FROM source_table WHERE false),
CAST(price AS LONG),
CAST(instrument as SYMBOL);

Here we changed type of price (assuming it was INT) to LONG and changed type of sym to symbol and created an index.

Designated timestamp#

The timestamp function allows for specifying which column (which must be of timestamp type) should be a designated timestamp for the table. For more information, see the designated timestamp reference.

The designated timestamp column cannot be changed after the table has been created.

Partitioning#

PARTITION BY allows for specifying the partitioning strategy for the table. Tables created via SQL are not partitioned by default (NONE) and tables can be partitioned by one of the following:

  • NONE: the default when partition is not defined.
  • YEAR
  • MONTH
  • WEEK
  • DAY
  • HOUR

The partitioning strategy cannot be changed after the table has been created.

Deduplication#

When Deduplication is enabled, QuestDB only inserts rows that do not match the existing data. When rows are inserted into a table with the deduplication option configured, QuestDB searches for existing rows to match using the specified UPSERT KEYS. If a match is found, the existing rows are replaced with the new row. If no match is found, the new rows are inserted into the table.

Deduplication can only be enabled for Write-Ahead Log (WAL) tables.

It is possible to include multiple columns of different types in the UPSERT KEYS list.

However, there are a few limitations to keep in mind:

  • The designated timestamp column must be included in the list of columns
  • Columns of STRING and BINARY types cannot be used in UPSERT KEYS list

After table creation the deduplication configuration can be changed at any time using ALTER table:

Examples#

Creating a table for tracking ticker prices with daily partitions and upsert deduplication
CREATE TABLE TICKER_PRICE (
ts TIMESTAMP,
ticker SYMBOL,
price DOUBLE
) TIMESTAMP(ts) PARTITION BY DAY WAL
DEDUP UPSERT KEYS(ts, ticker);
Enabling dedup on an existing table, for timestamp and ticker columns
ALTER TABLE TICKER_PRICE DEDUP ENABLE UPSERT KEYS(ts, ticker)
Disabling dedup on the entire table
ALTER TABLE TICKER_PRICE DEDUP DISABLE
Checking whether a table has dedup enabled
SELECT dedup FROM tables() WHERE table_name = '<the table name>'
Checking whether a column has dedup enabled
SELECT `column`, upsertKey from table_columns('<the table name>')

WITH table parameter#

Flow chart showing the syntax of keyword to specify WITH table parameter

The parameter influences how often commits of out-of-order data occur. It may be set during table creation using the WITH keyword.

maxUncommittedRows - defines the maximum number of uncommitted rows per-table to keep in memory before triggering a commit for a specific table.

The purpose of specifying maximum uncommitted rows per table is to reduce the occurrences of resource-intensive commits when ingesting out-of-order data.

The global setting for the same parameter is cairo.max.uncommitted.rows.

Setting out-of-order table parameters via SQL
CREATE TABLE my_table (timestamp TIMESTAMP) TIMESTAMP(timestamp)
PARTITION BY DAY WITH maxUncommittedRows=250000;

Checking the values per-table may be done using the tables() function:

List all tables
SELECT id, name, maxUncommittedRows FROM tables();
idnamemaxUncommittedRows
1my_table250000
2device_data10000

Table target volume#

The IN VOLUME clause is used to create a table in a different volume than the standard. The table is created in the specified target volume, and a symbolic link is created in the table's standard volume to point to it.

Flow chart showing the syntax of keywords to specify a table target volume

The use of the comma (,) depends on the existence of the WITH clause:

  • If the WITH clause is present, a comma is mandatory before IN VOLUME:

    CREATE TABLE my_table (i symbol, ts timestamp), index(i capacity 32) WITH maxUncommittedRows=7, IN VOLUME SECONDARY_VOLUME;
  • If no WITH clause is used, the comma must not be added for the IN VOLUME segment:

    CREATE TABLE my_table (i symbol, ts timestamp) IN VOLUME SECONDARY_VOLUME;

The use of quotation marks (') depends on the alias:

  • If the alias contains spaces, the quotation marks are required:

    CREATE TABLE my_table (i symbol, ts timestamp), index(i capacity 32) IN VOLUME 'SECONDARY VOLUME';
  • If the alias does not contain spaces, no quotation mark is necessary:

    CREATE TABLE my_table (i symbol, ts timestamp), index(i capacity 32) IN VOLUME SECONDARY_VOLUME;

Description#

The table behaves the same way as if it had been created in the standard (default) volume, with the exception that DROP TABLE removes the symbolic link from the standard volume but the content pointed to is left intact in its volume. A table using the same name in the same volume cannot be created again as a result, it requires manual intervention to either remove or rename the table's directory in its volume.

Configuration#

The secondary table target volume is defined by cairo.volumes in server.conf. The default setting contains an empty list, which means the feature is not enabled.

To enable the feature, define as many volume pairs as you need, with syntax alias -> volume-root-path, and separate different pairs with a comma. For example:

cairo.volumes=SECONDARY_VOLUME -> /Users/quest/mounts/secondary, BIN -> /var/bin

Additional notes about defining the alias and volume root paths:

  • Aliases are case-insensitive.
  • Volume root paths must be valid and exist at bootstrap time and at the time when the table is created.
  • Aliases and/or volume root paths can be single quoted, it is not required.

CREATE TABLE LIKE#

The LIKE keyword clones the table schema of an existing table without copying the data. Table settings and parameters such as designated timestamp, symbol column indexes, and index capacity will be cloned, too.

Create table like
CREATE TABLE new_table (LIKE my_table);

Examples#

The following examples demonstrate creating tables from basic statements, and introduce features such as partitioning and designated timestamps. For more information on the concepts introduced to below, see

  • designated timestamp reference on electing a timestamp column
  • partition documentation which describes how partitions work in QuestDB
  • symbol reference for using the symbol data type

This example will create a table without a designated timestamp and does not have a partitioning strategy applied.

Basic example
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING);

The same table can be created and a designated timestamp may be specified.

Adding a designated timestamp
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING)
TIMESTAMP(ts);
Adding a partitioning strategy by DAY
CREATE TABLE my_table(symb SYMBOL, price DOUBLE, ts TIMESTAMP, s STRING)
TIMESTAMP(ts)
PARTITION BY DAY;
Adding parameters for symbol type
CREATE TABLE
my_table(symb SYMBOL CAPACITY 256 NOCACHE INDEX CAPACITY 1048576,
price DOUBLE, ts TIMESTAMP, s STRING)
TIMESTAMP(ts)
PARTITION BY DAY;

Let's assume we imported a text file into the table taxi_trips_unordered and now we want to turn this data into time series through ordering trips by pickup_time, assign dedicated timestamp and partition by month:

Create table as select with data manipulation
CREATE TABLE taxi_trips AS(
SELECT * FROM taxi_trips_unordered ORDER BY pickup_time
) TIMESTAMP(pickup_time)
PARTITION BY MONTH;

โญ Something missing? Page not helpful? Please suggest an edit on GitHub.