Fluid real-time dashboards with Grafana and QuestDB
In this tutorial, how to create a real-time Grafana dashboard. The dashboard will have line charts as data visualizations that make use of aggregate SQL functions and Grafana global variables for sampling data based on dashboard settings. It's straight-forward and powerful!
We'll use the official QuestDB Grafana plugin. For further inspiration, checkout our NYC taxi and Crypto dashboards.
What is Grafana?
Grafana is an open-source visualization and dashboard tool for any type of data. It allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. With its powerful query language, you can quickly analyze complex data sets and create dynamic dashboards to monitor your applications and infrastructure. Grafana also provides an ever-growing library of plugins for data sources, panel types, and visualizations.
Grafana consists of a server that connects to one or more data sources to retrieve data, which the user then visualizes from their browser.
The following three Grafana features will be used in this tutorial:
- Data source - this is how you tell Grafana where your data is stored and how you want to access it. For this tutorial, we will have a QuestDB server running which we will access via Postgres Wire using the PostgreSQL data source plugin.
- Dashboard - A group of widgets that are displayed together on the same screen.
- Panel - A single visualization which can be a graph or table.
Setup
Start Grafana
Start Grafana using docker run
with the --add-host
parameter:
docker run --add-host=host.docker.internal:host-gateway \-p 3000:3000 --name=grafana \-v grafana-storage:/var/lib/grafana \grafana/grafana-oss
Once the Grafana server has started, you can access it via port 3000 (http://localhost:3000). The default login credentials are as follows:
user:adminpassword:admin
Start QuestDB
The Docker version for QuestDB can be run by exposing the port 8812
for the
PostgreSQL connection and port 9000
for the web and REST interface. Similar to
the Grafana container, we add the --add-host
parameter:
docker run --add-host=host.docker.internal:host-gateway \-p 9000:9000 -p 9009:9009 -p 8812:8812 -p 9003:9003 \-v "$(pwd):/var/lib/questdb" \-e QDB_PG_READONLY_USER_ENABLED=true \questdb/questdb:latest
Loading the dataset
On our live demo, uses 10+ years of taxi data. For this tutorial, we have a subset of that data, the data for the whole of February 2018. You can download the compressed dataset from Amazon S3:
curl https://s3-eu-west-1.amazonaws.com/questdb.io/datasets/grafana_tutorial_dataset.tar.gz > grafana_data.tar.gztar -xvf grafana_data.tar.gz
There should be two datasets available as .CSV
files:
weather.csv
taxi_trips_feb_2018.csv
These can be imported via curl using the /imp
REST entrypoint:
curl -F data=@taxi_trips_feb_2018.csv "http://localhost:9000/imp?timestamp=pickup_datetime&name=taxi_trips_feb_2018&partitionBy=Week"curl -F data=@weather.csv "http://localhost:9000/imp?timestamp=timestamp&name=weather&partitionBy=Month"
Creating your first visualization
Create a data source
In Grafana, select Data Sources under the Connections tab on the left hand panel and click the Add data source button. Navigate to the bottom of the page and click Find more data source plugins. Search for QuestDB and click Install.
Once the QuestDB data source for Grafana is finished installing, click on the blue Add new data source button where the Install button used to be. Finally, configure it with the following settings:
Server address:host.docker.internal
Server port: 8812
Username:user
Password:quest
TLS/SSL mode:disable
Note that by default, Grafana does not validate that queries are read-only. This
means it's possible to run queries such as drop table x
in Grafana which would
be destructive to a dataset.
To protect against this, we have set the environment variable
QDB_PG_READONLY_USER_ENABLED
to true when starting the QuestDB container. Once
this configuration is set, we enable a read-only user, and we could use the
default user (pg.readonly.user
= user) and password (pg.readonly.password
=
quest) to log in. More details for setting this configuration can be found on
QuestDB's Docker configuration page.
Create a dashboard and a panel
Now that we have a data source and a dashboard, we can add a panel. Click on the + symbol on the top of the page and select New dashboard. Click on the blue + Add visualization button.
Select the QuestDB data source we just created.
The new panel has a graphing area on the top half of the window and a query builder in the bottom half:
We can build queries either using the Query Builder or typing the whole query into the SQL Editor. We will create our first query visually and we will then go into SQL mode.
Our first query will show the average distance sampled by timestamp from the trips dataset.
Choose the taxi_trips_feb_2018
table from the Table dropdown and toggle
the Time Series option under Query Type.
We don't want to sample by any extra column other than the designated timestamp, so you can ignore the Sample by Keys dropdown.
Now we need to add the AVG
aggregation, so you need to click on the
Aggregate button and then select Average
, trip_distance
, and enter
distance
as the alias.
Since we want the results to be scoped to the dashboard time range, you need to
click on Filter, select the pickup_datetime
from the first dropdown, and
the WITHIN DASHBOARD TIME RANGE
value from the second dropdown. Please note
you might have to scroll down in the dropdown to find this option.
We are almost done. You only need to remove the default 100
value from the
Limit box, and we are ready to go.
Note that while you were choosing different options from the Query Builder, the SQL that will be sent to QuestDB is shown at the bottom. Click the time range selector above the chart and set the following date range:
- Set the From value to
2018-02-07 00:00:00
- Set the To value to
2018-02-14 23:59:59
and click Apply time range
We have built our first panel with aggregations:
To inspect — and manually edit if needed — your SQL, toggle the Query Builder to SQL Editor by clicking the button. The query editor will now accept SQL statements that we can input directly. It should display the following query, with our selections from the Query Builder:
SELECT pickup_datetime as time,
avg(trip_distance) distance
FROM "taxi_trips_feb_2018"
WHERE $__timeFilter(pickup_datetime)
SAMPLE BY $__sampleByInterval
ALIGN TO CALENDAR
Query details
To build a time-series dashboard in Grafana, the results need to be sorted by time. In QuestDB, we typically don't need to do anything as results tend to be sorted already. Check out Grafana time-series queries for more information.
To graph the average trip distance above, we use the avg()
function on the
trip_distance
column. This function aggregates data over the specified
sampling interval. If the sampling interval is 1-hour, we are calculating
the average distance traveled during each 1-hour interval. You can find more
information on QuestDB
aggregate functions on our documentation.
There are also 2 key Grafana-specific expressions used which can be identified
by the $__
prefix:
$__interval
This is a dynamic interval based on the time range applied to the
dashboard. By using this function, the sampling interval changes automatically
as the user zooms in and out of the panel.
$__timeFilter(pickup_datetime)
tells Grafana to send the start-time and
end-time defined in the dashboard to the QuestDB server. Given the settings we
have configured so far with our date range, Grafana translates this to the
following:
pickupDatetime BETWEEN '2018-02-01T00:00:00Z' AND '2018-02-28T23:59:59Z'
These are global variables which can be used in queries and elsewhere in panels and dashboards. When using the Query Editor, those variables are automatically included depending on our choices, but when using the SQL Editor you need to pass them manually. To learn more about the use of these variables, refer to the Grafana reference documentation on Global variables.
Finally, we use alias such as SELECT pickup_datetime AS time
in all the
queries. This is because Grafana expects a column named "time" for any
time-series chart. Note the builder automatically added the alias for us.
Adding multiple queries
You can add multiple queries to the same panel which will display multiple lines
on a graph. To demonstrate this, separate the taxi data into two series, one for
cash payments and one for card payments. The first query will have a default
name of A
--Cash
SELECT pickup_datetime AS time,
avg(trip_distance) AS cash
FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime))
WHERE $__timeFilter(pickup_datetime)
AND payment_type IN ('Cash')
SAMPLE BY $__interval;
Click + Query to add a second query (automatically labeled B
) and paste
the following in text mode:
--Card
SELECT pickup_datetime AS time,
avg(trip_distance) AS card
FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime))
WHERE $__timeFilter(pickup_datetime)
AND payment_type IN ('Card')
SAMPLE BY $__interval;
Both queries are now layered on the same panel with a green line for cash and a yellow line for card payments:
We can see in this graph that the distance traveled by those paying with cards is longer than for those paying with cash. This could be due to the fact that users usually carry less cash than the balance in their card.
Let’s add another panel by selecting Dashboards and + New dashboard:
This time, we will add the following query:
SELECT
pickup_datetime AS "time",
count()
FROM ('taxi_trips_feb_2018' timestamp(pickup_datetime))
WHERE $__timeFilter(pickup_datetime)
SAMPLE BY $__interval;
This is what our query looks like when viewing a time range of 28 days:
Zooming in to a single day shows more detailed data points as we are sampling by
Grafana's $__interval
property:
The daily cycle of activity is visible, with rides peaking in the early evening and reaching a low in the middle of the night.
ASOF JOIN
ASOF JOIN joins 2 tables based on timestamps that do not exactly match. To join the taxi trips data with weather data, enter the following query:
SELECT
pickup_datetime as "time",
avg(fare_amount) as fare_amount,
avg(rain1H) as rain1H
FROM (('taxi_trips_feb_2018' timestamp(pickup_datetime)) WHERE $__timeFilter(pickup_datetime))
ASOF JOIN (weather timestamp(timestamp))
SAMPLE BY $__interval;
To view a selected week in February 2018, select the time range picker above the chart:
- Set the From value to
2018-02-07 00:00:00
- Set the To value to
2018-02-14 23:59:59
and click Apply time range - Enable dual Y-axis in the option panel by using Overrides, simply
assigning different axis placements and units for fields
fare_amount
andrain1H
.
In this graph, we have 2 series, in green we have the fare amount sampled dynamically, and in yellow we have the average precipitation per hour in millimeters. From the graph, it’s hard to say whether there is a correlation between rain and the amount spent on taxi rides.
Conclusion
We have learned how to import time series data into QuestDB and build a dashboard with multiple queries in Grafana. We can use this dashboard to explore the relationship between taxi fares and rainfall, if any.
With a dual Y-axis, we can easily compare the two datasets and identify any correlations. Data analysis with a visualization like this can be extremely useful, as it can provide insights into how weather conditions and taxi fares affect each other.
If you like this content and want to see more tutorials about third-party integrations, let us know in our community forums.