Matrix profile XXIV: Scaling time series anomaly detection to trillions of datapoints and ultra-fast arriving data streams

Lu, Yue; Wu, Renjie; Al Mueen, Abdullah; Zuluaga, Maria A.; Keogh, Eamonn J.
KDD 2022, 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 14-18 August 2022, Washington DC, USA

Time series anomaly detection remains one of the most active areas of research in data mining. In spite of the dozens of creative solutions proposed for this problem, recent empirical evidence suggests that time series discords, a relatively simple twenty-year-old distance-based technique, remain among the state-of-the-art techniques. While there are many algorithms for computing time series discords, they all have limitations. First, they are limited to the batch case, whereas the online case is more actionable. Second, these algorithms exhibit poor scalability beyond tens of thousands of datapoints. In this work we introduce DAMP, a novel algorithm that addresses both of these issues. DAMP computes exact left-discords on fast-arriving streams, at up to 300,000 Hz using a commodity desktop. This allows us to find time series discords in datasets with trillions of datapoints for the first time. We will demonstrate the utility of our algorithm with the most ambitious set of time series anomaly detection experiments ever conducted.
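
To make the term "left-discord" concrete, the following is a minimal brute-force sketch, not the DAMP algorithm itself. It assumes the commonly used definition: the left-discord is the subsequence whose nearest neighbor among earlier, non-overlapping subsequences is farthest away under z-normalized Euclidean distance. The function names `znorm` and `left_discord` and the window-length parameter `m` are illustrative choices, not taken from the paper.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def left_discord(series, m):
    """Brute-force left-discord search (illustrative only, quadratic time).

    Returns (start_index, score), where score is the distance from the
    subsequence starting at start_index to its nearest neighbor among
    subsequences that end before it begins.
    """
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best_idx, best_score = -1, -1.0
    # The first m subsequences have no non-overlapping predecessor, so skip them.
    for i in range(m, len(subs)):
        # Only look to the left: candidates j with j + m <= i.
        nearest = min(math.dist(subs[i], subs[j]) for j in range(i - m + 1))
        if nearest > best_score:
            best_idx, best_score = i, nearest
    return best_idx, best_score


if __name__ == "__main__":
    # A periodic signal with one injected spike at position 30.
    data = [0.0, 1.0, 0.0, -1.0] * 12
    data[30] = 5.0
    print(left_discord(data, m=4))
```

Because each subsequence is compared only against its past, a score of this kind can be computed as data arrives, which is what makes the left-discord formulation suitable for the online setting described above. The quadratic cost of this sketch is exactly the scalability problem the paper sets out to address.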

Type: Conference
City: Washington DC
Date: 2022-08-14
Department: Data Science
Eurecom Ref: 6945
Copyright:
© ACM, 2022. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in KDD 2022, 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 14-18 August 2022, Washington DC, USA https://doi.org/10.1145/3534678.3539271

Permalink: https://www.eurecom.fr/publication/6945