The nature of time-series data is that the relevance of information diminishes over time. If stale data is no longer required, users can delete old data from QuestDB to either save disk space or adhere to a data retention policy. This is achieved in QuestDB by removing data partitions from a table.
This page provides a high-level overview of partitioning with examples to drop data by date. For more details on partitioning, see the partitioning page.
#Strategy for data retention
A simple approach to removing stale data is to drop data that has been
partitioned by time. A table must have a
designated timestamp assigned and a
partitioning strategy specified during a
CREATE TABLE operation to achieve
Users cannot alter the partitioning strategy after a table is created.
Tables can be partitioned by one of the following:
DROP PARTITION with care, as QuestDB cannot recover data from dropped
To drop partitions, users can use the ALTER TABLE DROP PARTITION syntax. Partitions may be dropped by:
DROP PARTITION LIST- specifying a comma-separated list of partitions to drop--Delete a partitionALTER TABLE my_table DROP PARTITION LIST '2021-01-01';--Delete a list of two partitionsALTER TABLE my_table DROP PARTITION LIST '2021-01-01', '2021-01-02';
WHERE timestamp =- exact date matching by timestampALTER TABLE my_table DROP PARTITIONWHERE timestamp = to_timestamp('2021-01-01', 'yyyy-MM-dd');
WHERE timestamp <- using comparison operators (
>) to delete by time range relative to a timestamp. Note that the
now()function may be used to automate dropping of partitions relative to the current time, i.e.:--Drop partitions older than 30 daysWHERE timestamp < dateadd('d', -30, now())
- The most chronologically recent partition cannot be deleted
- Arbitrary partitions may be dropped, which means they may not be the oldest chronologically. Depending on the types of queries users are performing on a dataset, it may not be desirable to have gaps caused by dropped partitions.
The following example demonstrates how to create a table with partitioning and
to drop partitions based on time. This example produces 5 days' worth of data
with one incrementing
LONG value inserted per hour.
For reference, the following functions are used to generate the example data:
- timestamp sequence with 1 hour stepping
- row generator with
long_sequence()function which creates a
The result of partitioning is visible when listing as directories on disk:
Partitions can be dropped using the following query: