Navigating the metric maze: a taxonomy of evaluation metrics for anomaly detection in time series

Published: 18 November 2023

Abstract

The field of time series anomaly detection is constantly advancing, with several methods available, making it a challenge to determine the most appropriate method for a specific domain. The evaluation of these methods is facilitated by the use of metrics, which vary widely in their properties. Despite the existence of new evaluation metrics, there is limited agreement on which metrics are best suited for specific scenarios and domains, and the most commonly used metrics have faced criticism in the literature. This paper provides a comprehensive overview of the metrics used for the evaluation of time series anomaly detection methods, and also defines a taxonomy of these based on how they are calculated. By defining a set of properties for evaluation metrics and a set of specific case studies and experiments, twenty metrics are analyzed and discussed in detail, highlighting the unique suitability of each for specific tasks. Through extensive experimentation and analysis, this paper argues that the choice of evaluation metric must be made with care, taking into account the specific requirements of the task at hand.
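
To make the closing point concrete, the following is a minimal sketch of how two scoring conventions can disagree on the same detector output. It compares point-wise F1 with F1 computed after the "point-adjust" protocol that is common in the literature, under which a labelled anomaly segment counts as fully detected if any point inside it is flagged. The function names (`pointwise_f1`, `point_adjust`) and the toy labels and predictions are illustrative assumptions, not material from the paper, and the sketch is not claimed to reproduce any of the twenty metrics the paper analyzes.

```python
from typing import List, Sequence


def pointwise_f1(labels: Sequence[int], preds: Sequence[int]) -> float:
    """F1 score where every time step is treated as an independent sample."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def point_adjust(labels: Sequence[int], preds: Sequence[int]) -> List[int]:
    """Return predictions where each labelled segment that contains at
    least one predicted point is marked as detected in full."""
    adjusted = list(preds)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            # Find the end (exclusive) of the current labelled segment.
            j = i
            while j < n and labels[j] == 1:
                j += 1
            if any(preds[i:j]):
                for k in range(i, j):
                    adjusted[k] = 1
            i = j
        else:
            i += 1
    return adjusted


# Hypothetical toy data: one anomaly segment of length 4, one hit inside
# the segment and one false alarm outside it.
labels = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
preds = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

print(f"point-wise F1:     {pointwise_f1(labels, preds):.3f}")  # 0.333
print(f"point-adjusted F1: {pointwise_f1(labels, point_adjust(labels, preds)):.3f}")  # 0.889
```

The same predictions score roughly 0.33 under one convention and 0.89 under the other, which is the kind of discrepancy that makes the choice of metric task-dependent.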


Published In

Data Mining and Knowledge Discovery, Volume 38, Issue 3
May 2024
732 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 18 November 2023
Accepted: 24 October 2023
Received: 20 February 2023

Author Tags

  1. Time series
  2. Anomaly detection
  3. Evaluation
  4. Taxonomy

Qualifiers

  • Research-article
