This submission comes from one of our community contributors Yitaek Hwang.
In Part I of this series, we used Google Data Studio to quickly import multiple data sources and compare the price action of various cryptocurrencies over time. Even though Google Data Studio provides an easy user interface and some nice graphing features, it was limited in what it could do in terms of analyzing the dataset. In this post, we will use popular Python libraries (pandas, matplotlib, and seaborn) to explore the dataset further.
NOTE: If you still have QuestDB running from Part I on Google Cloud, you can skip to the next step.
For simplicity, we will run QuestDB locally via Docker and import the datasets via the console UI. Download and run the latest version of QuestDB:
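If you have Docker installed, a command along these lines starts QuestDB locally (port 9000 serves the web console, and 8812 is the Postgres wire endpoint we will connect to from Python):

```shell
# Pull and run the latest QuestDB image, exposing the console and Postgres ports
docker run -p 9000:9000 -p 8812:8812 questdb/questdb
```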
Navigate to localhost:9000, click on the “Upload” icon on the left-hand panel, and import the CSV files of interest. This example uses the Solana dataset, but any of the other coins in the dataset will work as well.
Now install the packages we will use to explore the dataset:
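With pip, the install looks roughly like this (`psycopg2-binary` is the commonly used prebuilt variant of the Postgres driver):

```shell
pip install psycopg2-binary pandas matplotlib seaborn jupyter
```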
If you used Anaconda Navigator, go under Environments > Search Packages to install:
Now we’re ready to launch the notebook and start exploring the data:
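Launching Jupyter from the terminal:

```shell
jupyter notebook
```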
First, we need to use the psycopg2 library to connect to QuestDB and import the dataset into pandas. (NOTE: if you are reusing QuestDB from Part I, change the host to the IP address of the load balancer.)
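A minimal sketch of this step. QuestDB speaks the Postgres wire protocol on port 8812 with default credentials `admin`/`quest`; the table name is whatever you called the CSV on import (assumed `solana` here):

```python
import pandas as pd

def fetch_table(table: str, host: str = "localhost") -> pd.DataFrame:
    """Read a whole QuestDB table into a DataFrame over the Postgres wire protocol."""
    import psycopg2  # QuestDB listens on port 8812 by default
    conn = psycopg2.connect(
        host=host, port=8812, user="admin", password="quest", database="qdb"
    )
    try:
        return pd.read_sql_query(f"SELECT * FROM {table};", conn)
    finally:
        conn.close()

# df = fetch_table("solana")  # point host at the load balancer if reusing Part I
```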
Now we can run some quick queries to make sure our import was successful. The head and tail functions are useful here for a quick sanity check:
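For example (using a tiny synthetic frame in place of the imported Solana data):

```python
import pandas as pd

# Synthetic stand-in for the dataframe loaded from QuestDB
df = pd.DataFrame(
    {"Open": [1.0, 1.1, 1.2, 1.3, 1.4], "Close": [1.1, 1.2, 1.1, 1.5, 1.3]},
    index=pd.date_range("2020-07-01", periods=5, freq="D"),
)

print(df.head(3))  # first rows of the dataset
print(df.tail(3))  # last rows of the dataset
```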
Alternatively, we can use the info and describe commands to get a sense of the data types and distribution:
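A quick sketch of both calls on a small example frame:

```python
import pandas as pd

df = pd.DataFrame({"Open": [1.0, 1.1, 1.2], "Close": [1.1, 1.2, 1.1]})

df.info()              # column dtypes and non-null counts
stats = df.describe()  # count/mean/std/min/quartiles/max per numeric column
print(stats)
```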
For good measure, we can also check for NA values to make sure we’re working with a clean dataset. These checks should return False for all the columns. If your dataset does have missing values, you can use the dropna function to remove the affected rows or columns.
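The check and the cleanup can be sketched like this (the synthetic frame below has one missing value on purpose):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Open": [1.0, np.nan, 1.2], "Close": [1.1, 1.2, 1.1]})

print(df.isna().any())  # per-column check; False everywhere means a clean dataset
clean = df.dropna()     # drop rows that contain any missing value
```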
Now that we have our dataset in Jupyter, we can answer some simple questions. For example, we can find the five lowest prices of Solana by running the nsmallest function on the column we’re interested in (e.g. High, Low, Open, or Close). Similarly, we can use the nlargest function for the opposite:
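For instance, on a small stand-in column:

```python
import pandas as pd

df = pd.DataFrame({"Low": [3.1, 0.5, 2.2, 0.9, 4.0, 1.7, 0.8]})

lowest = df.nsmallest(5, "Low")  # five rows with the smallest Low values
highest = df.nlargest(5, "Low")  # five rows with the largest Low values
print(lowest)
```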
We can also find days when the open price was lower than the closing price by filtering the DataFrame with a boolean condition:
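A minimal sketch of that comparison, using a tiny synthetic frame in place of the real data:

```python
import pandas as pd

df = pd.DataFrame({"Open": [1.0, 1.3, 1.2], "Close": [1.1, 1.2, 1.5]})

up_days = df[df["Open"] < df["Close"]]  # rows where the price closed higher than it opened
print(len(up_days))
```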
To get a better sense of the trends, we can resample the dataset. For example, we can get the mean prices by week by using the resample function on the DataFrame:
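On a date-indexed frame, weekly resampling looks like this (the two-week synthetic series is a stand-in for the real data):

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2021-01-04", periods=14, freq="D")  # two full Mon-Sun weeks
df = pd.DataFrame({"Close": np.arange(14, dtype=float)}, index=idx)

weekly = df.resample("W").mean()  # mean price per calendar week (weeks end on Sunday)
print(weekly)
```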
Now our data is aggregated by week. Similarly, we can get a rolling average over a set window via the rolling function:
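For example, a 3-point moving average (the first window-1 entries are NaN because the window is not yet full):

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
roll = s.rolling(window=3).mean()  # 3-point moving average
print(roll)
```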
Note that seaborn chose some sane defaults for the x-axis. We can also specify our own labels and legends, as in the plot below comparing the High, Low, and Opening prices of Solana over time:
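A sketch of such a labeled plot, using matplotlib directly and synthetic data standing in for the Solana dataframe:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so this also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataframe loaded from QuestDB
idx = pd.date_range("2020-07-01", periods=120, freq="D")
rng = np.random.default_rng(42)
base = np.cumsum(rng.normal(0, 0.5, len(idx))) + 10
df = pd.DataFrame({"High": base + 0.5, "Low": base - 0.5, "Open": base}, index=idx)

fig, ax = plt.subplots(figsize=(10, 5))
for col in ["High", "Low", "Open"]:
    ax.plot(df.index, df[col], label=col)  # one labeled line per price column
ax.set_xlabel("Date")
ax.set_ylabel("Price (USD)")
ax.legend()
```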
From a quick glance, it is hard to tell the trends of any of these price actions. Let’s dive into the time period when the Solana price starts to become more volatile (i.e. after Jan 2021). We can select this subset with the loc function and apply it to our full dataset, the weekly mean resample, and the seven-day rolling mean.
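The slicing step can be sketched like this (synthetic daily data in place of the real series; loc accepts date strings on a DatetimeIndex):

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2020-11-01", periods=120, freq="D")
df = pd.DataFrame({"Open": np.linspace(1.0, 40.0, len(idx))}, index=idx)

# Restrict each series to the volatile period after Jan 2021
subset = df.loc["2021-01-01":]
weekly = df["Open"].resample("W").mean().loc["2021-01-01":]
rolling7 = df["Open"].rolling(7).mean().loc["2021-01-01":]
print(subset.index.min())
```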
After giving each of the datasets a different marker, we can plot all of them on the same graph to see when the opening price was above or below the weekly or moving-window average:
We can zoom in further on the May-to-July timeframe to capture the volatility.
Alternatively, we can apply the popular 200-day moving average metric to analyze price trends. Finally, we can also use the ewm function in pandas to calculate an exponentially weighted moving average and derive different momentum indicators. While past performance is not indicative of future results, these momentum indicators can be used to backtest and formulate new trading or price analysis strategies.
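Both averages can be sketched as follows (a short synthetic series stands in for the closing prices; in practice you would use `window=200` for the classic metric):

```python
import pandas as pd

close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

sma = close.rolling(window=3).mean()          # simple moving average
ema = close.ewm(span=3, adjust=False).mean()  # exponentially weighted: recent prices count more
print(ema)
```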
With QuestDB and pandas, it’s easy to visualize and calculate various price and momentum indicators. By polling data from CoinGecko or CoinMarketCap, anyone can start analyzing crypto trends. If you are interested in more crypto analysis, make sure to check out: