SELECT
SELECT
allows you to specify list of columns and expressions to be selected
and evaluated from a table.
Syntax
Note that the
TABLE
you query from can either be a table in your database (in which case you would pass the table's name), or the result of a sub query.
Simple Select
All columns
QuestDB supports SELECT * FROM tablename
. When selecting all, you can also
omit most of the statement and simply pass the table name.
The two examples below are equivalent
SELECT * FROM ratings;
// is equivalent to //
ratings;
Specific columns
To select specific columns, replace * by the names of the columns you are interested in.
Example:
SELECT movieId, rating FROM ratings;
Arithmetic Expressions
SELECT
is capable of evaluating multiple expressions and functions. You can
mix comma separated lists of expressions with the column names you are
selecting.
SELECT movieId, (100  rating)*2, rating > 3.5 good
FROM ratings;
The result of rating > 3.5
is a boolean. The column will be named good and
take values true or false.
Aliases
Using aliases allow you to give expressions or column names of your choice. You can assign an alias to a column or an expression by writing the alias name you want after that expression
Alias names and column names must be unique. Example:
SELECT movieId alias1, rating alias2
FROM ratings
Aggregation
Aggregation functions can be used in arithmetic expressions.
Aggregate functions
Currently implemented aggregate functions:
Function  Remarks 

sum(expr) 
Sums values of numeric expr . Return type is double for expr of types
double ,float ; int for expr of types byte ,
short ,int and long for expr of type long

lsum(expr) 
Sums values of expr and returns long . This function can sum int values without
overflowing the result.

avg(expr) 
Calculates average value of expr .

max(expr) 
Calculates maximum value of expr . Return type is the same as type of expr

min(expr) 
Calculates minimum value of expr . Return type is the same as type of expr

first(expr) 
First value of expr in natural order of records.

last(expr) 
Last value of expr in natural order of records.

count()  Calculates count of records in query. 
count(expr) 
Calculates count of records with nonnull values of expr

var(expr) 
Calculates variance of expr .

stddev(expr) 
Calculates standard deviation of values of expr

Aggregation by group
QuestDB evaluates aggregation functions without need for traditional GROUP BY
.
Simply use a mix of column names and aggregation functions in a SELECT
clause.
You can have any number of discrete value columns and any number of aggregation
functions.
SELECT movieId, avg(rating), count() FROM ratings;
// is equivalent to //
SELECT movieId, avg(rating), count() FROM ratings GROUP BY movieId;
Aggregation arithmetic
Aggregation functions can be used in arithmetic expressions. The following
computes mid
of rating values for every movie.
SELECT movieId, (min(rating) + max(rating))/2 mid, count() count FROM ratings;
WHERE clause
Syntax
QuestDB supports standard WHERE
clause for filtering data. Filter expressions
are required to return boolean result.
SELECT COLUMN1, COLUMN2...
FROM TABLENAME
WHERE CONDITIONS;
CONDITIONS
are expressions that return boolean results.
Operators
Advanced conditions can be built using logical operators.
List of supported BOOLEAN
operators:
Operator  Remarks 

and  Logical AND operator. 
or  Logical OR operator. 
not 
Logical NOT operator. Example where not(x > 10)

<  Numerical LESS than. 
>  Numerical GREATER than. 
<=  Numerical LESS than OR EQUAL. 
>=  Numerical GREATER than OR EQUAL. 
= 
Equality operator applicable to all types. where x = 'ABC' and y = 10

eq 
Equality operator applicable to double type. It compares two double values with a delta.
where eq(x, 0.5, 0.000001) . This is logically equivalent to abs(x0.5) < 0.000001

!=  Nonequality operator applicable to all types. 
in 
Returns true when value of left argument is one of list of values of right argument.
where x in ('ABC','CDE','HJK') . This operator is performs better than logically equivalent
where x = 'ABC' or x = 'CDE' or x = 'HJK'

~ 
Returns true when value of left argument matches regular expression. where x ~ 'THE'
Here THE is the pattern. In this case it matches all strings that contain THE . For more
details on the pattern please refer to
Java Pattern documentation

Exact timestamp
Use =
operator and UTC date string for exact timestamp matches:
SELECT * FROM ratings WHERE timestamp = '20100112T00:02:26.000Z'
QuestDB SQL optimiser will create more efficient plan when data is time series naturally ordered by timestamp.
Interval timestamp
QuestDB supports interval timestamp search with comparison operators. However, to obtain best performance, we recommend using our timestamp search notation described below
>
,>=
,<
,<=
operators:
Using SELECT * FROM ratings WHERE timestamp > '20100112T00:00:00.000Z' and timestamp < '20100112T00:59:59.999Z'
in
operator:
Using SELECT * FROM ratings WHERE timestamp in ('20100112T00:00:00.000Z', '20100112T00:59:59.999Z')
in
is inclusive of edges and supports exactly two UTC timestamps.
Using QuestDB timestamp search notation
Using =
operator and partial UTC timestamp string. Example below selects data
between 14:00 and 14:59 on 12 January 2010:
SELECT * FROM ratings WHERE timestamp = '20100112T14'
To obtain intervals UTC timestamps can be truncated at seconds, minutes, days, months and years. This example selects data for the whole of 2011:
SELECT * FROM ratings WHERE timestamp = '2011'
Using =
operator and interval length modifier to specify longer than single
time unit intervals. This example selects 2pm, 3pm and 4pm data:
SELECT * FROM ratings WHERE timestamp = '20100112T14;2h'
QuestDB uses the following algorithm to create the interval: 20100112T14
is
translated into natural interval
[20100112T14:00:00.000Z, 20100112T14:59:59.999Z]
, then 2 hours added to
the upper bound resulting in
[20100112T14:00:00.000Z, 20100112T16:59:59.999Z]
interval.
Interval modifier format is:
where letters stand for:
Modifier  Remarks 

s  Seconds 
m  Minutes 
h  Hours 
d  Days 
M  Months 
y  Years 
Floating Point
You can use eq
operator to compare double
and float
values with tolerance
to avoid rounding problems. For example:
SELECT * FROM prices WHERE eq(bid, 1.56, 0.000001)
You can also use =
:
SELECT * FROM prices WHERE bid = 1.56
but =
would not match 1.56
and 1.559999999999
.
Best practice for floating point values would be to store as LONG integer using scaling factors to avoid roundingrelated issues.
Search using aggregation results
Subqueries can be used to filter on aggregation results in QuestDB SQL. It is
fairly simple because of the optional nature of select .. from
.
The following example selects all movies that received over 50,000 ratings.
(select movieId x, (min(rating) + max(rating))/2 mid, count() count from ratings) where count > 50000
Standard SQL equivalent would be:
select movieId, (min(rating) + max(rating))/2 mid, count() count from ratings
group by movieId
having count() > 50000
ORDER BY
Usage
ORDER BY
is used to sort the results of a query in ascending or descending
order.
Syntax
SELECT COLUMN1, COLUMN2...
FROM TABLENAME
ORDER BY COLUMN1 [ASC]DESC, COLUMN2 [ASC]DESC ...;
Parameter  Remarks 

COLUMN  Columns by which you wish to order 
ASC  To sort in ascending order 
DESC  To sort in descending order 
Note that ASC is optional and can be omitted.
Examples
Order by one column in ascending order:
ratings ORDER BY userId;
// is equivalent to //
SELECT * FROM ratings ORDER BY userId ASC;
Order by one column in descending order:
ratings ORDER BY userId DESC;
Order by several columns:
ratings ORDER BY userId, rating DESC;
// is equivalent to //
SELECT * FROM ratings ORDER BY userId ASC , rating DESC;
SELECT DISTINCT
Usage
SELECT DISTINCT
is used to return only distinct (i.e different) values.
Syntax
SELECT DISTINCT COLUMN1, COLUMN2, ...
FROM TABLE;
Example
The following query will return a list of all unique ratings in the table.
SELECT DISTINCT rating
FROM ratings;
SELECT DISTINCT can be used in conjunction with more advanced queries and filters.
The following query will return a list of all unique ratings in the table, and the number of times they occur.
SELECT DISTINCT rating, count()
FROM ratings;
The following query will return a list of all ratingsuserId couples in the table and hoy many times each user has assigned each rating. It is also filtered for ratings superior to 3
SELECT DISTINCT rating, userId, count()
FROM ratings
WHERE rating > 3;
LIMIT
Overview
LIMIT
is used to specify the number of records to return. Furthermore, you can
specify whether the position of the rows (first n rows, last n rows, n rows
after skipping m rows etc) .
In other implementations of SQL, this is sometimes replaced by statements such as
OFFSET
orROWNUM
Our implementation ofLIMIT
encompasses both in one statement.
Syntax
Statements with LIMIT
follow this syntax:
SELECT column1, column2, ...
FROM table
LIMIT NUM_ROWS;
Where NUM_ROWS
is the number of rows you want to return with the query.
Limit will by default start from the TOP of the table. If you would like to get results from the BOTTOM of the table, then l should be a negative number.
Examples
The following will return the TOP 5 results.
SELECT * FROM ratings LIMIT 5;
For a results table with rows from 1 to n, it will return rows [1, 2, 3, 4, 5]
The following will return the BOTTOM 5 results:
SELECT * FROM ratings LIMIT 5;
For a results table with rows from 1 to n, it will return rows [n5, n4, n3, n2, n1, n]
Range
You can use two parameters to return a range. To do so, you should use the syntax
SELECT . . FROM . . LIMIT n, m;
Where n
is the lower bound of your range (exclusive), and m
is the upper
bound of your range (inclusive)
For example, the following return records {3, 4, 5}
ratings LIMIT 2,5;
When used with negative
numbers, you can return a range starting from the
bottom.
For example, the following will return records between n7 (exclusive) and n3 (inclusive), i.e {n6, n5, n4, n3}
ratings LIMIT 7, 3;
SAMPLE BY
Overview
SAMPLE BY
is used on timeseries data to summarise large datasets into
aggregates of homogeneous time chunks.
To use
SAMPLE BY
, one column needs to be designated astimestamp
. Find out more in the CREATE TABLE section.
Syntax
SAMPLE BY
syntax is as follows:
SELECT columns
FROM table
SAMPLE BY nSAMPLE_SIZE
WHere SAMPLE_SIZE
is the unit of time by which you wish to aggregate your
results, and n
is the number of timechunks that will be summarised together.
You can sample by any multiple of the following:
Modifier  Remarks 

s  Seconds 
m  Minutes 
h  Hours 
d  Days 
M  Months 
y  Years 
Examples
Assume the following table
TRADES
===============================================
timestamp, buysell, quantity, price

ts1 B q1 p1
ts2 S q2 p2
ts3 S q3 p3
... ... ... ...
tsn B qn pn
The following will return the number of trades per hour:
SELECT timestamp, count()
FROM TRADES
SAMPLE BY 1h;
The following will return the trade volume in 30 minute intervals
SELECT timestamp, sum(quantity*price)
FROM TRADES
SAMPLE BY 30m;
The following will return the average trade notional (where notional is = q * p) by day:
SELECT timestamp, avg(quantity*price)
FROM TRADES
SAMPLE BY 1d;
FILL
Overview
FILL
is an option for SAMPLE BY
to determine how the results of a query are
displayed when one or more of your timeslices have no data. This is useful when
you would like your query results to be homogeneous in time: rather than
skipping the time chunk, the query will return a value determine by the type of
FILL you choose.
Syntax
SELECT timestamp, aggr1, aggr2, ...
FROM table
SAMPLE BY YOUR_SAMPLE_SIZE
FILL(FILL_OPTION_1, FILL_OPTION_2...);
FILL_OPTION
can be any of the following:
Option  Remarks 

NONE  Will not fill. In case there is no data, the timechunk will be skipped in the results. This means your table could potentially be missing intervals. 
NULL  Will fill with NULL 
PREV  Will fill using the previous data point. 
LINEAR  Will fill with the result of the linear interpolation of the surrounding 2 points. 
0  Will fill with 0. Note you can replace 0 with any number of your choice (e.g fill 100.00) 
Fill must be used on aggregations such as sum(), count() etc.
Examples
Consider the following table
PRICES
======================
timestamp, price

ts1 p1
ts2 p2
ts3 p3
... ...
tsn pn
We could run the following to get the minimum, maximum and average price per hour using the following query:
SELECT timestamp, min(price) min, max(price) max, avg(price) avg
FROM PRICES
SAMPLE BY 1h;
It would generally return result like this:
RESULTS
======================================
timestamp, min, max, average

ts1 min1 max1 avg1
... ... ... ...
tsn minn maxn avgn
However, in case there was no PRICES
data for a given hour, your table would
have timechunks missing. For example
RESULTS
======================================
timestamp, min, max, average

ts1 min1 max1 avg1
ts2 min2 max2 avg2
ts4 min4 max4 avg4
... ... ... ...
tsn minn maxn avgn
Here you can see that the third time chunk is missing. This is because there was no price update in the third hour. Let's see what different fill values would return:
SELECT timestamp, min(price) min, max(price) max, avg(price) avg
FROM PRICES
SAMPLE BY 1h
FILL(null, 0, prev);
would return:
RESULTS
======================================
timestamp, min, max, average

ts1 min1 max1 avg1
ts2 min2 max2 avg2
ts3 NULL 0 avg2 <<< FILL VALUES
ts4 min4 max4 avg4
... ... ... ...
tsn minn maxn avgn
And the following:
SELECT timestamp, min(price) min, avg(price) avg
FROM PRICES
SAMPLE BY 1h
FILL(25.5, linear);
Would return:
RESULTS
======================================
timestamp, min, average

ts1 min1 avg1
ts2 min2 avg2
ts3 25.5 (avg2+avg4)/2 <<< FILL VALUES
ts4 min4 avg4
... ... ...
tsn minn avgn