CREATE TABLE reference
To create a new table in the database, the CREATE TABLE
keywords followed by
column definitions are used.
#
Syntaxinfo
Checking table metadata can be done via the tables()
and table_columns()
functions which are described in the
meta functions documentation page.
#
IF NOT EXISTSAn optional IF NOT EXISTS
clause may be added directly after the
CREATE TABLE
keywords to indicate that a new table should be created if one
with the desired table name does not already exist.
#
Table nameInternally the table name is used as a directory name on the file system. It can
contain both ASCII and Unicode characters. The table name must be unique and
an error is returned if a table already exists with the requested name. Table
names containing spaces or period .
character must be enclosed in double
quotes, for example:
#
Column nameAs with table names, the column name is used for file names internally. Although it does support both ASCII and Unicode characters, character restrictions specific to the file system still apply. Tables may have up to 2,147,483,647 columns.
note
Column names must be unique within each table and must not contain a period
.
character.
#
Type definitionWhen specifying a column, a name and
type definition must be provided. The symbol
type may have additional optional parameters applied.
#
SymbolsOptional keywords and parameters may follow the symbol
type which allow for
further optimization on the handling of this type. For more information on the
benefits of using this type, see the symbol overview.
#
Symbol capacityCAPACITY
is an optional keyword used when defining a symbol type on table
creation to indicate how many distinct values this column is expected to have.
When distinctValueEstimate
is not explicitly specified, a default value of
cairo.default.symbol.capacity
is used.
distinctValueEstimate
- the value used to size data structures for
symbols. These data structures will resize themselves
when necessary to allow QuestDB to function correctly. Underestimating the
symbol value count may result in drop of performance whereas over-estimating may
result in higher disk space and memory consumption.
The symbol capacity is not to be confused with index capacity described in column indexes below.
#
Symbol cachingCACHE | NOCACHE
is used to specify whether a symbol should be cached. CACHE
means that QuestDB will use a Java Heap-based Map to resolve symbol values and
keys. When a column has a large number of distinct symbol values (over 100,000,
for example), the heap impact might be significant and may cause OutOfMemory
errors, depending on the heap size.
To avoid Java Heap impact, NOCACHE
leverages an off-heap structure which can
deal with larger value counts but is slower. The default option for symbol
types is CACHE
.
#
Casting typescastDef
- casts the type of a specific column. columnRef
must reference
existing column in the selectSql
#
Column indexesIndex definitions (indexDef
) are used to create an
index for a table column. The referenced table column
must be of type SYMBOL
.
An index capacity may be provided for the index using and references a
valueBlockSize
valueBlockSize
is an index storage parameter that specifies how many row IDs
to store in a single storage block on disk. This value is optional and will
default to the value of the cairo.index.value.block.size
configuration key. Fewer blocks used to store
row IDs achieves better performance. At the same time over-sizing
valueBlockSize
will result in higher than necessary disk space usage.
Consider an example table with 200 unique stock symbols and 1,000,000,000 records over time. The index will have to store 1,000,000,000 / 200 row IDs for each symbol, i.e. 5,000,000 per symbol.
- If
valueBlockSize
is 1,048,576 in this case, QuestDB will use 5 blocks to store the row IDs - If
valueBlockSize
is 1,024 in this case, the block count will be 4,883
#
Designated timestampThe timestamp function allows for specifying which column (which must be of
timestamp
type) should be a designated timestamp for the table. For more
information, see the designated timestamp
reference.
caution
The designated timestamp column cannot be changed after the table has been created.
#
PartitioningPARTITION BY
allows for specifying the
partitioning strategy for the table. The default
partitioning strategy is NONE
and tables can be partitioned by one of the
following:
YEAR
MONTH
DAY
HOUR
caution
The partitioning strategy cannot be changed after the table has been created.
#
WITH table parametersTable parameters which influence how often commits of out-of-order data occur
may be set during table creation using the WITH
keyword. The following two
parameters may be applied:
maxUncommittedRows
- equivalent tocairo.max.uncommitted.rows
commitLag
- equivalent tocairo.commit.lag
expects a value with a modifier to specify the unit of time for the value:unit description us microseconds s seconds m minutes h hours d days
Checking the values per-table may be done using the tables()
function:
id | name | maxUncommittedRows | commitLag |
---|---|---|---|
1 | my_table | 250000 | 240000000 |
2 | device_data | 10000 | 30000000 |
For more information on commit lag and the maximum uncommitted rows, see the guide for out-of-order commits and ILP commit strategy.
#
ExamplesThe following examples demonstrate creating tables from basic statements, and introduce features such as partitioning and designated timestamps. For more information on the concepts introduced to below, see
- designated timestamp reference on electing a timestamp column
- partition documentation which describes how partitions work in QuestDB
- symbol reference for using the
symbol
data type
This example will create a table without a designated timestamp and does not have a partitioning strategy applied.
The same table can be created and a designated timestamp may be specified. New records with timestamps which are out-of-order (O3) chronologically will be ordered at the point of ingestion. Configuring how the system handles ingestion of out-of-order records is done via commit lag configuration.
#
CREATE TABLE AS#
Cloning existing SQL structureWhen SQL is SELECT * FROM tab
or any arbitrary SQL result, the table data will
be copied with the corresponding structure.
note
Notice the where false
condition.
Here we changed type of price
(assuming it was INT
) to LONG
and changed
type of sym
to symbol and created an
index.
#
Create a new table using SQL structure and dataLet's assume we imported a text file into the table taxi_trips_unordered
and
now we want to turn this data into time series through ordering trips by
pickup_time
, assign dedicated timestamp and partition by month:
#
CREATE TABLE WITH parametersAdding WITH
to a create table statement allows for providing table-specific
configuration.
#
Specifying commit lag and maximum uncommitted rowsLet's assume we have out-of-order records arriving at a table my_table
. If we
know beforehand that the maximum lag of later records is likely to be 240
seconds, we can schedule sorting and commits of out-of-order data to occur
within this time boundary. The lag configuration can be combined with the
maximum uncommitted rows so that a commit will occur based on expected row
count, or the lag boundary is met:
For more information on out-of-order lag and uncommitted rows, see the documentation for out-of-order data commits.