
Implications of Z-normalization in the matrix profile
- Author
- Dieter De Paepe, Diego Nieves Avendano (UGent) and Sofie Van Hoecke (UGent)
- Abstract
- Companies are increasingly measuring their products and services, resulting in a growing amount of available time series data and a need for techniques that extract usable information from it. One state-of-the-art technique for time series is the Matrix Profile, which has been used for various applications including motif/discord discovery, visualization and semantic segmentation. Internally, the Matrix Profile uses the z-normalized Euclidean distance to compare the shape of subsequences between two series. However, when comparing subsequences that are relatively flat and contain noise, the resulting distance is high despite the visual similarity of these subsequences. This property violates some of the assumptions made by Matrix Profile based techniques, resulting in worse performance when series contain flat and noisy subsequences. By studying the properties of the z-normalized Euclidean distance, we derived a method to eliminate this effect that requires only an estimate of the standard deviation of the noise. In this paper we describe various practical properties of the z-normalized Euclidean distance and show how these can be used to correct the performance of Matrix Profile related techniques. We demonstrate our technique for anomaly detection on a Yahoo! Webscope anomaly dataset, for semantic segmentation on the PAMAP2 activity dataset and for data visualization on a UCI activity dataset, all containing real-world data, and obtain overall better results after applying it. Our technique is a straightforward extension of the distance calculation in the Matrix Profile and will benefit any derived technique dealing with time series containing flat and noisy subsequences.
- Keywords
- TIME-SERIES, Matrix profile, Time series, Noise, Anomaly detection, Time series segmentation
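The effect described in the abstract, where flat and noisy subsequences end up far apart under the z-normalized Euclidean distance, can be reproduced with a small sketch. This is an illustration only and not the authors' code or their correction method: the helper names `znorm` and `znorm_euclidean`, the subsequence length `m`, and the noise level `noise_std` are all assumptions chosen for the example.

```python
import math
import random
import statistics


def znorm(seq):
    """Rescale seq to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(seq)
    sigma = statistics.pstdev(seq)
    return [(x - mu) / sigma for x in seq]


def znorm_euclidean(a, b):
    """Euclidean distance between the z-normalized versions of a and b."""
    return math.dist(znorm(a), znorm(b))


rng = random.Random(0)  # fixed seed so the run is repeatable
m = 100                 # subsequence length (arbitrary, hypothetical choice)
noise_std = 0.05        # assumed noise level, small next to the sine amplitude

# Two noisy copies of the same sine wave: a clear shared shape.
sine = [math.sin(2 * math.pi * i / m) for i in range(m)]
sine_a = [s + rng.gauss(0, noise_std) for s in sine]
sine_b = [s + rng.gauss(0, noise_std) for s in sine]

# Two noisy copies of the same constant: both look flat to the eye.
flat_a = [1.0 + rng.gauss(0, noise_std) for _ in range(m)]
flat_b = [1.0 + rng.gauss(0, noise_std) for _ in range(m)]

d_sine = znorm_euclidean(sine_a, sine_b)
d_flat = znorm_euclidean(flat_a, flat_b)

print(f"sine pair distance: {d_sine:.2f}")
print(f"flat pair distance: {d_flat:.2f}")
print(f"largest possible distance, 2*sqrt(m): {2 * math.sqrt(m):.2f}")
```

In the flat pair, z-normalization divides by a standard deviation that consists of noise alone, so the noise is scaled up to unit variance and the two subsequences are compared as if they were unrelated signals. The sine pair keeps a small distance because the same noise is small relative to the signal's own spread.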
Downloads
- DS306 i.pdf: full text (Accepted manuscript), open access, 3.67 MB
- (...).pdf: full text (Published version), UGent only, 5.99 MB
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-8644995
- MLA
- De Paepe, Dieter, et al. “Implications of Z-Normalization in the Matrix Profile.” Pattern Recognition Applications and Methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers, edited by Maria De Marsico et al., vol. 11996, Springer, 2020, pp. 95–118, doi:10.1007/978-3-030-40014-9_5.
- APA
- De Paepe, D., Nieves Avendano, D., & Van Hoecke, S. (2020). Implications of Z-normalization in the matrix profile. In M. De Marsico, G. Sanniti di Baja, & A. Fred (Eds.), Pattern recognition applications and methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers (Vol. 11996, pp. 95–118). Cham: Springer. https://doi.org/10.1007/978-3-030-40014-9_5
- Chicago author-date
- De Paepe, Dieter, Diego Nieves Avendano, and Sofie Van Hoecke. 2020. “Implications of Z-Normalization in the Matrix Profile.” In Pattern Recognition Applications and Methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers, edited by Maria De Marsico, Gabriella Sanniti di Baja, and Ana Fred, 11996:95–118. Cham: Springer. https://doi.org/10.1007/978-3-030-40014-9_5.
- Chicago author-date (all authors)
- De Paepe, Dieter, Diego Nieves Avendano, and Sofie Van Hoecke. 2020. “Implications of Z-Normalization in the Matrix Profile.” In Pattern Recognition Applications and Methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers, edited by Maria De Marsico, Gabriella Sanniti di Baja, and Ana Fred, 11996:95–118. Cham: Springer. doi:10.1007/978-3-030-40014-9_5.
- Vancouver
- 1. De Paepe D, Nieves Avendano D, Van Hoecke S. Implications of Z-normalization in the matrix profile. In: De Marsico M, Sanniti di Baja G, Fred A, editors. Pattern recognition applications and methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers. Cham: Springer; 2020. p. 95–118.
- IEEE
- [1] D. De Paepe, D. Nieves Avendano, and S. Van Hoecke, “Implications of Z-normalization in the matrix profile,” in Pattern recognition applications and methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers, Prague, Czech Republic, 2020, vol. 11996, pp. 95–118.
@inproceedings{8644995,
  abstract     = {{Companies are increasingly measuring their products and services, resulting in a rising amount of available time series data, making techniques to extract usable information needed. One state-of-the-art technique for time series is the Matrix Profile, which has been used for various applications including motif/discord discovery, visualizations and semantic segmentation. Internally, the Matrix Profile utilizes the z-normalized Euclidean distance to compare the shape of subsequences between two series. However, when comparing subsequences that are relatively flat and contain noise, the resulting distance is high despite the visual similarity of these subsequences. This property violates some of the assumptions made by Matrix Profile based techniques, resulting in worse performance when series contain flat and noisy subsequences. By studying the properties of the z-normalized Euclidean distance, we derived a method to eliminate this effect requiring only an estimate of the standard deviation of the noise. In this paper we describe various practical properties of the z-normalized Euclidean distance and show how these can be used to correct the performance of Matrix Profile related techniques. We demonstrate our techniques using anomaly detection using a Yahoo! Webscope anomaly dataset, semantic segmentation on the PAMAP2 activity dataset and for data visualization on a UCI activity dataset, all containing real-world data, and obtain overall better results after applying our technique. Our technique is a straightforward extension of the distance calculation in the Matrix Profile and will benefit any derived technique dealing with time series containing flat and noisy subsequences.}},
  author       = {{De Paepe, Dieter and Nieves Avendano, Diego and Van Hoecke, Sofie}},
  booktitle    = {{Pattern recognition applications and methods, 8th International Conference, ICPRAM 2019, Prague, Czech Republic, February 19-21, 2019, Revised Selected Papers}},
  editor       = {{De Marsico, Maria and Sanniti di Baja, Gabriella and Fred, Ana}},
  isbn         = {{9783030400132}},
  issn         = {{0302-9743}},
  keywords     = {{TIME-SERIES,Matrix profile,Time series,Noise,Anomaly detection,Time series segmentation}},
  language     = {{eng}},
  location     = {{Prague, Czech Republic}},
  pages        = {{95--118}},
  publisher    = {{Springer}},
  title        = {{Implications of Z-normalization in the matrix profile}},
  url          = {{http://dx.doi.org/10.1007/978-3-030-40014-9_5}},
  volume       = {{11996}},
  year         = {{2020}},
}