Fluid real-time dashboards with Grafana and QuestDB

QuestDB is the world's fast growing time-series database. It offers premium ingestion throughput, enhanced SQL analytics that can power through analysis, and cost-saving hardware efficiency. It's open source and integrates with many tools and languages.

In this tutorial, how to create a real-time Grafana dashboard. The dashboard will have line charts as data visualizations that make use of aggregate SQL functions and Grafana global variables for sampling data based on dashboard settings. It's straight-forward and powerful!

We'll use the official QuestDB Grafana plugin. For further inspiration, checkout our NYC taxi and Crypto dashboards.

What is Grafana?

Grafana is an open-source visualization and dashboard tool for any type of data. It allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. With its powerful query language, you can quickly analyze complex data sets and create dynamic dashboards to monitor your applications and infrastructure. Grafana also provides an ever-growing library of plugins for data sources, panel types, and visualizations.

Grafana consists of a server that connects to one or more data sources to retrieve data, which the user then visualizes from their browser.

The following three Grafana features will be used in this tutorial:

  1. Data source - this is how you tell Grafana where your data is stored and how you want to access it. For this tutorial, we will have a QuestDB server running which we will access via Postgres Wire using the PostgreSQL data source plugin.
  2. Dashboard - A group of widgets that are displayed together on the same screen.
  3. Panel - A single visualization which can be a graph or table.

Setup

Start Grafana

Start Grafana using docker run with the --add-host parameter:

docker run --add-host=host.docker.internal:host-gateway \
-p 3000:3000 --name=grafana \
-v grafana-storage:/var/lib/grafana \
grafana/grafana-oss

Once the Grafana server has started, you can access it via port 3000 (http://localhost:3000). The default login credentials are as follows:

user:admin
password:admin

Start QuestDB

The Docker version for QuestDB can be run by exposing the port 8812 for the PostgreSQL connection and port 9000 for the web and REST interface. Similar to the Grafana container, we add the --add-host parameter:

docker run --add-host=host.docker.internal:host-gateway \
-p 9000:9000 -p 9009:9009 -p 8812:8812 -p 9003:9003 \
-v "$(pwd):/var/lib/questdb" \
-e QDB_PG_READONLY_USER_ENABLED=true \
questdb/questdb:latest

Loading the dataset

On our live demo, uses 10+ years of taxi data. For this tutorial, we have a subset of that data, the data for the whole of February 2018. You can download the compressed dataset from Amazon S3:

curl https://s3-eu-west-1.amazonaws.com/questdb.io/datasets/grafana_tutorial_dataset.tar.gz > grafana_data.tar.gz
tar -xvf grafana_data.tar.gz

There should be two datasets available as .CSV files:

  • weather.csv
  • taxi_trips_feb_2018.csv

These can be imported via curl using the /imp REST entrypoint:

curl -F data=@taxi_trips_feb_2018.csv "http://localhost:9000/imp?timestamp=pickup_datetime&name=taxi_trips_feb_2018&partitionBy=Week"
curl -F data=@weather.csv "http://localhost:9000/imp?timestamp=timestamp&name=weather&partitionBy=Month"

Creating your first visualization

Create a data source

In Grafana, select Data Sources under the Connections tab on the left hand panel and click the Add data source button. Navigate to the bottom of the page and click Find more data source plugins. Search for QuestDB and click Install.

Once the QuestDB data source for Grafana is finished installing, click on the blue Add new data source button where the Install button used to be. Finally, configure it with the following settings:

Server address:host.docker.internal Server port: 8812 Username:user Password:quest TLS/SSL mode:disable
Screenshot of configuring data source

Note that by default, Grafana does not validate that queries are read-only. This means it's possible to run queries such as drop table x in Grafana which would be destructive to a dataset.

To protect against this, we have set the environment variable QDB_PG_READONLY_USER_ENABLED to true when starting the QuestDB container. Once this configuration is set, we enable a read-only user, and we could use the default user (pg.readonly.user = user) and password (pg.readonly.password = quest) to log in. More details for setting this configuration can be found on QuestDB's Docker configuration page.

Create a dashboard and a panel

Now that we have a data source and a dashboard, we can add a panel. Click on the + symbol on the top of the page and select New dashboard. Click on the blue + Add visualization button.

Screenshot of a new dashboard with a 'Add new panel' button

Select the QuestDB data source we just created.

The new panel has a graphing area on the top half of the window and a query builder in the bottom half:

Screenshot of a blank panel after being created

We can build queries either using the Query Builder or typing the whole query into the SQL Editor. We will create our first query visually and we will then go into SQL mode.

Our first query will show the average distance sampled by timestamp from the trips dataset.

Choose the taxi_trips_feb_2018 table from the Table dropdown and toggle the Time Series option under Query Type.

We don't want to sample by any extra column other than the designated timestamp, so you can ignore the Sample by Keys dropdown.

Now we need to add the AVG aggregation, so you need to click on the Aggregate button and then select Average, trip_distance, and enter distance as the alias.

Since we want the results to be scoped to the dashboard time range, you need to click on Filter, select the pickup_datetime from the first dropdown, and the WITHIN DASHBOARD TIME RANGE value from the second dropdown. Please note you might have to scroll down in the dropdown to find this option.

We are almost done. You only need to remove the default 100 value from the Limit box, and we are ready to go.

Screenshot of the Query Editor with the selections we just made.

Note that while you were choosing different options from the Query Builder, the SQL that will be sent to QuestDB is shown at the bottom. Click the time range selector above the chart and set the following date range:

  • Set the From value to 2018-02-07 00:00:00
  • Set the To value to 2018-02-14 23:59:59 and click Apply time range

We have built our first panel with aggregations:

A panel showing the average distance traveled, filtered by taxi type and with dynamic sampling.

To inspect — and manually edit if needed — your SQL, toggle the Query Builder to SQL Editor by clicking the button. The query editor will now accept SQL statements that we can input directly. It should display the following query, with our selections from the Query Builder:

SELECT pickup_datetime as time, avg(trip_distance) distance FROM "taxi_trips_feb_2018" WHERE $__timeFilter(pickup_datetime) SAMPLE BY $__sampleByInterval ALIGN TO CALENDAR
Screenshot showing how to toggle text edit mode

Query details

To build a time-series dashboard in Grafana, the results need to be sorted by time. In QuestDB, we typically don't need to do anything as results tend to be sorted already. Check out Grafana time-series queries for more information.

To graph the average trip distance above, we use the avg() function on the trip_distance column. This function aggregates data over the specified sampling interval. If the sampling interval is 1-hour, we are calculating the average distance traveled during each 1-hour interval. You can find more information on QuestDB aggregate functions on our documentation.

There are also 2 key Grafana-specific expressions used which can be identified by the $__ prefix:

$__interval This is a dynamic interval based on the time range applied to the dashboard. By using this function, the sampling interval changes automatically as the user zooms in and out of the panel.

$__timeFilter(pickup_datetime) tells Grafana to send the start-time and end-time defined in the dashboard to the QuestDB server. Given the settings we have configured so far with our date range, Grafana translates this to the following:

pickupDatetime BETWEEN '2018-02-01T00:00:00Z' AND '2018-02-28T23:59:59Z'

These are global variables which can be used in queries and elsewhere in panels and dashboards. When using the Query Editor, those variables are automatically included depending on our choices, but when using the SQL Editor you need to pass them manually. To learn more about the use of these variables, refer to the Grafana reference documentation on Global variables.

Finally, we use alias such as SELECT pickup_datetime AS time in all the queries. This is because Grafana expects a column named "time" for any time-series chart. Note the builder automatically added the alias for us.

Adding multiple queries

You can add multiple queries to the same panel which will display multiple lines on a graph. To demonstrate this, separate the taxi data into two series, one for cash payments and one for card payments. The first query will have a default name of A

--Cash SELECT pickup_datetime AS time, avg(trip_distance) AS cash FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime)) WHERE $__timeFilter(pickup_datetime) AND payment_type IN ('Cash') SAMPLE BY $__interval;

Click + Query to add a second query (automatically labeled B) and paste the following in text mode:

--Card SELECT pickup_datetime AS time, avg(trip_distance) AS card FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime)) WHERE $__timeFilter(pickup_datetime) AND payment_type IN ('Card') SAMPLE BY $__interval;

Both queries are now layered on the same panel with a green line for cash and a yellow line for card payments:

A panel showing the average distance travelled, filtered by taxi type and with dynamic sampling.

We can see in this graph that the distance traveled by those paying with cards is longer than for those paying with cash. This could be due to the fact that users usually carry less cash than the balance in their card.

Let’s add another panel by selecting Dashboards and + New dashboard:

A screenshot showing the add new dashboard option.

This time, we will add the following query:

SELECT pickup_datetime AS "time", count() FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime)) WHERE $__timeFilter(pickup_datetime) SAMPLE BY $__interval;

This is what our query looks like when viewing a time range of 28 days:

A panel showing the number of trips over a month using dynamic sampling.

Zooming in to a single day shows more detailed data points as we are sampling by Grafana's $__interval property:

A panel showing the number of trips in a day using dynamic sampling.

The daily cycle of activity is visible, with rides peaking in the early evening and reaching a low in the middle of the night.

ASOF JOIN

ASOF JOIN joins 2 tables based on timestamps that do not exactly match. To join the taxi trips data with weather data, enter the following query:

SELECT pickup_datetime as "time", avg(fare_amount) as fare_amount, avg(rain1H) as rain1H FROM (('taxi_trips_feb_2018' timestamp(pickup_datetime)) WHERE $__timeFilter(pickup_datetime)) ASOF JOIN (weather timestamp(timestamp)) SAMPLE BY $__interval;

To view a selected week in February 2018, select the time range picker above the chart:

  • Set the From value to 2018-02-07 00:00:00
  • Set the To value to 2018-02-14 23:59:59 and click Apply time range
  • Enable dual Y-axis in the option panel by using Overrides, simply assigning different axis placements and units for fields fare_amount and rain1H.
Assign different placement and units respectively for average fare amount and hourly precipitation.

In this graph, we have 2 series, in green we have the fare amount sampled dynamically, and in yellow we have the average precipitation per hour in millimeters. From the graph, it’s hard to say whether there is a correlation between rain and the amount spent on taxi rides.

A panel showing taxi fares plotted against rainfall

Conclusion

We have learned how to import time series data into QuestDB and build a dashboard with multiple queries in Grafana. We can use this dashboard to explore the relationship between taxi fares and rainfall, if any.

With a dual Y-axis, we can easily compare the two datasets and identify any correlations. Data analysis with a visualization like this can be extremely useful, as it can provide insights into how weather conditions and taxi fares affect each other.

If you like this content and want to see more tutorials about third-party integrations, let us know in our community forums.

RedditHackerNewsX
Subscribe to our newsletters for the latest. Secure and never shared or sold.