What is anomaly detection?

Anomaly detection is an important part of time series analysis in which data points that deviate significantly from the rest of the data are identified. These data points are known as anomalies. A deviation can be defined and measured with a mathematical measure of distance, such as the number of standard deviations from the mean, or against a user-defined value, such as a threshold. An anomaly can also be a value that falls well outside any previously expected or defined boundary.

Algorithms for anomaly detection

While there are many approaches for anomaly detection in time series data, common approaches can be categorized into the following buckets:

  • Threshold-based: The simplest approach is to define thresholds so that any data point that falls outside the allowed range is marked as an anomaly. This approach works well for use cases where the values are bounded and where normal behavior is known beforehand. For example, we can set thresholds for CPU and memory usage based on historical usage. In IoT applications, thresholds are often used to detect overheating
  • Statistical methods: For use cases where a threshold is not immediately apparent, we can apply statistical methods to mark anomalies based on some mathematical measure of distance from normal values. These methods include measuring how many standard deviations a data point lies from the mean, which is its Z-score. Exponential smoothing algorithms may also be used to estimate the expected or normal behavior, and data points that deviate far from those estimates are marked as anomalies
  • Machine learning algorithms: For complex data sets or for use cases requiring high degrees of accuracy, machine learning algorithms may be used. Tree-based algorithms such as isolation forests are popular due to their low memory requirements. Other common approaches include support vector machines (SVMs) and autoencoders
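
The first two buckets above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than a production implementation: the function names, the sample CPU readings, and the parameter values (the 90 percent ceiling, the Z-score limit, the smoothing factor alpha and the tolerance) are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def threshold_anomalies(values, lower, upper):
    """Return indices of points outside the closed range [lower, upper]."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def zscore_anomalies(values, z_limit=3.0):
    """Return indices of points whose absolute Z-score exceeds z_limit."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_limit]


def smoothing_anomalies(values, alpha=0.3, tolerance=10.0):
    """Flag points that lie far from a simple exponential smoothing estimate.

    Each point is compared with the smoothed level of the earlier points.
    Flagged points are not folded into the level, so one spike does not
    distort the estimate of normal behavior. This is a simplification: a
    genuine, lasting shift in the data would keep being flagged.
    """
    if not values:
        return []
    anomalies = []
    level = values[0]  # seed the level with the first observation
    for i in range(1, len(values)):
        if abs(values[i] - level) > tolerance:
            anomalies.append(i)
        else:
            level = alpha * values[i] + (1 - alpha) * level
    return anomalies


# Hypothetical CPU usage readings in percent, with one spike at index 6.
cpu = [41, 43, 40, 42, 44, 41, 97, 43, 42, 40]

print(threshold_anomalies(cpu, lower=0, upper=90))
# A limit below 3 is used here because, with only 10 points and the sample
# standard deviation, a single outlier cannot reach a Z-score of 3.
print(zscore_anomalies(cpu, z_limit=2.5))
print(smoothing_anomalies(cpu, alpha=0.3, tolerance=10.0))
```

All three calls flag the spike at index 6. The Z-score variant shows why statistical limits need care on short series: the outlier itself inflates the standard deviation, so the conventional limit of 3 would miss it in this sample.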

Applications of anomaly detection

Anomaly detection in time series data is used in various industries:

  • Financial: fraud detection in financial transactions is one of the most important applications of anomaly detection. Other use cases include detecting outliers in stock trading activity for compliance or for generating new stock trading ideas
  • Network monitoring: many security teams employ some form of anomaly detection on their networks to detect security breaches and other malicious activity
  • Medical: time series data in medical applications is used to detect anomalies in biomarkers that may indicate a health problem, such as an irregular heartbeat. Personal health devices also monitor movement and other data points, like heart rate and blood pressure, for anomalies in order to detect falls or other sudden changes in behavior
  • Internet-of-Things (IoT): common applications of anomaly detection in IoT use cases include detecting failing sensors and supporting predictive maintenance. Outliers in sensor readings can indicate an impending need for replacement or repair