Capacity planning

It is important to plan and configure system resources before deploying QuestDB. This involves forecasting the requirements for CPU, memory, and network capacity, based on the expected demands of the system. Learn more about configuring these system resources with example scenarios that align with both edge cases and common setup configurations.

Most configuration settings - except OS settings - are configured in QuestDB using the server.conf configuration file, or as environment variables. For more information about applying configuration settings in QuestDB, see the configuration page.

To monitor the various metrics of a QuestDB instance, refer to the Prometheus monitoring page or the Logging & Metrics page.

Storage and filesystem#

Some of the aspects to consider regarding the storage of data and file systems.

Supported filesystem#

QuestDB officially supports the following filesystems:

  • APFS
  • EXT4
  • NTFS
  • OVERLAYFS (used by Docker)
  • XFS (ftype=1 only)

Other file systems supporting mmap feature may work with QuestDB but they should not be used in production, as QuestDB does not run tests on them.

When you use an unsupported file system, QuestDB logs this warning:

-> UNSUPPORTED (SYSTEM COULD BE UNSTABLE)"
caution

Users can't use NFS or similar distributed filesystems directly with a QuestDB database.

Write amplification#

In QuestDB, the write amplification is calculated using the formula from metrics: questdb_physically_written_rows_total / questdb_committed_rows_total.

When ingesting out-of-order data, a high disk write rate combined with high write amplification may reduce database performance.

For data ingestion over PostgreSQL Wire Protocol, or as a further step for InfluxDB Line Protocol ingestion, using smaller table partitions can reduce write amplification. This applies in particular to tables with partition directories exceeding several hundred MBs on disk. For example, PARTITION BY DAY could be reduced to PARTIION BY HOUR, PARTITION BY MONTH to PARTITION BY DAY, and so on.

Partition splitting#

Since QuestDB 7.2, heavily out-of-order commits may split partitions into smaller parts to reduce write amplification. When data is merged into an existing partition due to an out-of-order insert, the partition will be split into two parts: the prefix sub-partition and the suffix sub-partition.

Consider the following scenario:

  • A partition 2023-01-01.1 with 1,000 rows every hour, and therefore 24,000 rows in total.
  • Inserting one row with the timestamp 2023-01-01T23:00

When the out-of-order row 2023-01-01T23:00 is inserted, the partition is split into 2 parts:

  • Prefix: 2023-01-01.1 with 23,000 rows
  • Suffix (including the merged row):2023-01-01T75959-999999.2 with 1,001 rows

See Splitting and squashing time partitions for more information.

CPU and RAM configuration#

This section describes configuration strategies based on the forecasted behavior of the database.

RAM size#

We recommend having at least 8GB of RAM for basic workloads, and 32GB for more advanced ones.

For relatively small datasets i.e 4-40GB, and a read-heavy workload, performance can be improved by maximising use of the OS page cache. Users should consider increasing available RAM to improve the speed of read operations.

Memory page size configuration#

With frequent out-of-order (O3) writes over a large number of columns/tables, database performance may be impacted by large memory page sizes, as this increases the demand for RAM. The memory page, cairo.o3.column.memory.size, is set to 8M by default. This means that the table writer uses 16MB (2x8MB) RAM per column when it receives O3 writes. O3 write performance, and overall memory usage, may be improved by decreasing this value within the range [128K, 8M]. A smaller page size allows for a larger number of in-use columns, or otherwise frees up memory for other database processes to use.

CPU cores#

By default, QuestDB tries to use all available CPU cores. The guide on shared worker configuration explains how to change the default settings. Assuming that the disk is not bottlenecked on IOPS, the throughput of read-only queries scales proportionally with the number of available cores. As a result, a machine with more cores will provide better query performance.

Shared workers#

In QuestDB, there are worker pools which help to share out CPU-load between sub-systems.

The number of worker threads shared across the application can be configured, as well as affinity to pin processes to specific CPUs by ID. Shared worker threads service SQL execution subsystems and, in the default configuration, every other subsystem. More information on these settings can be found on the shared worker configuration page.

If you manually configure the thread pool sizes, please ensure that you test these changes for your specific use case. Raising the shared worker count from one thread per core, to two per core, may provide signficant speed up for certain queries, but could impact overall performance during heavy database load.

QuestDB will allocate CPU resources differently depending on how many CPU cores are available. This default can be overridden via configuration settings. We recommend at least 4 cores for basic workloads and 16 for advanced ones.

note

Prior to QuestDB 7.3.10, one thread would be dedicated to the InfluxDB Line Protocol writer, meaning that for a 4 core CPU, there would be 3 shared workers. Now by default, this thread will not be dedicated, and instead be allocated as part of the shared worker pool.

To emulate pre-7.3.10 behaviour, users must set the line.tcp.writer.worker.count config option to 1.

8 CPU cores or less#

QuestDB will configure a shared worker pool to handle everything.

The minimum size of the shared worker pool is 4 for any number of cores.

16 CPU cores or less#

InfluxDB Line Protocol I/O Worker pool is configured to use 2 CPU cores to speed up ingestion. The shared worker pool handles everything else, and is configured using the formula:

(cpuAvailable) - (cpuSpare) - (line.tcp.writer.worker.count) - (line.tcp.io.worker.count)

For example, with 16 cores, the shared pool will have 14 threads:

16 - 0 - 0 - 2 = 14

32 CPU cores or less#

The InfluxDB Line Protocol I/O Worker pool is configured to use 6 CPU cores to speed up ingestion. 1 CPU core is left as a spare for the OS to use. The shared worker pool handles everything else, and is configured using the formula:

(cpuAvailable) - (cpuSpare) - (line.tcp.writer.worker.count) - (line.tcp.io.worker.count)

For example, with 32 cores, the shared pool will have 25 threads:

32 - 1 - 0 - 6 = 25

33 CPU cores and more#

The InfluxDB Line Protocol I/O Worker pool is configured to use 6 CPU cores to speed up ingestion. 2 CPU cores are left as spares for the OS to use. The shared worker pool handles everything else, and is configured using the formula:

(cpuAvailable) - (cpuSpare) - (line.tcp.writer.worker.count) - (line.tcp.io.worker.count)

For example, with 64 cores, the shared pool will have 56 threads:

64 - 2 - 0 - 6 = 56

Writer page size#

The default page size for writers is 16MB. This should be adjusted according to your use case. For example, using a 16MB page-size, to write only 1MB of data is a waste of resources. To change this default value, set the cairo.writer.data.append.page.size option in server.conf:

server.conf
cairo.writer.data.append.page.size=1M

For more horizontal use cases i.e databases with a large number of small tables, the page sizes could be reduced more dramatically. This may better distribute resources, and help to reduce write amplification.

InfluxDB over TCP#

We have a documentation page dedicated to capacity planning for InfluxDB Line Protocol ingestion.

Postgres#

For clients sending data to QuestDB using the Postgres interface, the following configuration can be applied, which sets a dedicated worker and pins it with affinity to a CPU by core ID:

server.conf
pg.worker.count=4
pg.worker.affinity=1,2,3,4

Network Configuration#

For the InfluxDB Line Protocol, PostgreSQL Wire Protocol and HTTP, there are a number of configuration settings which control:

  • the number of clients that may connect
  • the internal I/O capacities
  • connection timeout settings

These settings are configured in the server.conf file, and follow the naming convention:

<protocol>.net.connection.<config>

Where <protocol> is one of:

  • http - HTTP connections
  • pg - PostgreSQL Wire Protocol
  • line.tcp - InfluxDB line protocol over TCP

And <config> is one of the following settings:

keydescription
limitThe number of simultaneous connections to the server. This value is intended to control server memory consumption.
timeoutConnection idle timeout in milliseconds. Connections are closed by the server when this timeout lapses.
hintApplicable only for Windows, where TCP backlog limit is hit. For example Windows 10 allows max of 200 connection. Even if limit is set higher, without hint=true, it won't be possible to serve more than 200 connections.
sndbufMaximum send buffer size on each TCP socket. If value is -1 socket send buffer remains unchanged from OS default.
rcvbufMaximum receive buffer size on each TCP socket. If value is -1, the socket receive buffer remains unchanged from OS default.

For example, this is a configuration for Linux with a relatively low number of concurrent connections:

server.conf InfluxDB Line Protocol network example configuration for a low number of concurrent connections
# bind to all IP addresses on port 9009
line.tcp.net.bind.to=0.0.0.0:9009
# maximum of 30 concurrent connection allowed
line.tcp.net.connection.limit=30
# nothing to do here, connection limit is quite low
line.tcp.net.connection.hint=false
# connections will time out after 60s of no activity
line.tcp.net.connection.timeout=60000
# receive buffer is 4MB to accomodate large messages
line.tcp.net.rcvbuf=4M

This is an example for when one would like to configure InfluxDB Line Protocol for a large number of concurrent connections, on Windows:

server.conf InfluxDB Line Protocol network example configuration for large number of concurrent connections on Windows
# bind to specific NIC on port 9009, NIC is identified by IP address
line.tcp.net.bind.to=10.75.26.3:9009
# large number of concurrent connections
line.tcp.net.connection.limit=400
# Windows will not allow 400 client to connect unless we use the "hint"
line.tcp.net.connection.hint=true
# connections will time out after 30s of inactivity
line.tcp.net.connection.timeout=30000
# receive buffer is 1MB because messages are small, smaller buffer will
# reduce memory usage, 400 connections times 1MB = 400MB RAM required to handle input
line.tcp.net.rcvbuf=1M

For more information on the default settings for the http and pg protocols, refer to the server configuration page.

Pooled connections#

Connection pooling should be used for any production-ready use of PostgreSQL Wire Protocol or InfluxDB Line Protocol over TCP.

The maximum number of pooled connections is configurable, (pg.connection.pool.capacity for PostgreSQL Wire Protocol and (line.tcp.connection.pool.capacity for InfluxDB Line Protocol over TCP. The default number of connections for both interfaces is 64. Users should avoid using too many connections, as large numbers of concurrent connections will increase overall CPU usage.

OS configuration#

Changing system settings on the host OS can improve QuestDB performance. QuestDB may reach system limits relating to maximum open files, and virtual memory areas.

QuestDB writes operating system errors to its logs unchanged. We only recommend changing the following system settings in response to seeing such OS errors in the logs.

Maximum open files#

QuestDB uses a columnar storage model, and therefore its core data structures relate closely to the file system. Columnar data is stored in its own .d file, per time partition. In edge cases with extremely large tables, frequent out-of-order ingestion, or a high number of table partitions, the number of open files may hit a user or system-wide maximum limit, causing reduced performance and other unwanted behaviours.

In Linux/MacOS environments, maximum open file limits for the current user:

# Soft limit
ulimit -Sn
# Hard limit
ulimit -Hn

Setting the open file limit for the current user:#

On a Linux environment, one must increase the hard limit. On MacOS, both the hard and soft limits must be set. See Max Open Files Limit on MacOS for the JVM for more details.

Modify user limits using ulimit:

# Hard limit
ulimit -H -n 1048576
# Soft limit
ulimit -S -n 1048576

The system-wide limit should be increased correspondingly.

Setting the system-wide open file limit on Linux:#

To increase this setting and persist this configuration change, the limit on the number of concurrently open files can be amended in /etc/sysctl.conf:

/etc/sysctl.conf
fs.file-max=1048576

To confirm that this value has been correctly configured, reload sysctl and check the current value:

# reload configuration
sysctl -p
# query current settings
sysctl fs.file-max

Setting system-wide open file limit on MacOS:#

On MacOS, the system-wide limit can be modified by using launchctl:

sudo launchctl limit maxfiles 98304 2147483647

To confirm the change, view the current settings using sysctl:

sysctl -a | grep kern.maxf

Max virtual memory areas limit#

The database relies on memory mapping to read and write data to its files. If the host machine has low limits on virtual memory mapping areas, this can cause out-of-memory exceptions (errno=12). To increase this setting and persist this configuration change, mapped memory area limits can be amended in /etc/sysctl.conf:

/etc/sysctl.conf
vm.max_map_count=1048576

Each mapped area may consume ~128 bytes for each map count i.e 1048576 may use 1048576*128 = 134MB of kernel memory.

# reload configuration
sysctl -p
# query current settings
cat /proc/sys/vm/max_map_count

โญ Something missing? Page not helpful? Please suggest an edit on GitHub.