Validation set sampling strategies for predictive process monitoring
- Author
- Jari Peeperkorn, Seppe vanden Broucke (UGent) and Jochen De Weerdt
- Abstract
- Previous studies investigating the efficacy of long short-term memory (LSTM) recurrent neural networks in predictive process monitoring and their ability to capture the underlying process structure have raised concerns about their limited ability to generalize to unseen behavior. Event logs often fail to capture the full spectrum of behavior permitted by the underlying processes. To overcome these challenges, this study introduces innovative validation set sampling strategies based on control-flow variant-based resampling. These strategies have undergone extensive evaluation to assess their impact on hyperparameter selection and early stopping, resulting in notable enhancements to the generalization capabilities of trained LSTM models. In addition, this study expands the experimental framework to enable accurate interpretation of underlying process models and provide valuable insights. By conducting experiments with event logs representing process models of varying complexities, this research elucidates the effectiveness of the proposed validation strategies. Furthermore, the extended framework facilitates investigations into the influence of event log completeness on the learning quality of predictive process models. The novel validation set sampling strategies proposed in this study facilitate the development of more effective and reliable predictive process models, ultimately bolstering generalization capabilities and improving the understanding of underlying process dynamics.
- Keywords
- Hardware and Architecture, Information Systems, Software, Process mining, Predictive process monitoring, LSTM, Generalization, Validation set, Log completeness
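The abstract names validation set sampling based on control-flow variants but does not spell out the procedure. The sketch below is one plausible reading, not the authors' published method: it groups traces by variant (the exact sequence of activity labels) and holds out whole variants, so the validation set contains only behavior absent from training. The function name `variant_holdout_split` and the parameters `val_fraction` and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def variant_holdout_split(traces, val_fraction=0.2, seed=0):
    """Split traces so that no control-flow variant occurs in both sets.

    Assumption: a trace is a list of activity labels, and a variant is the
    exact sequence of those labels. This is an illustrative sketch only.
    """
    # Group trace indices by variant.
    by_variant = defaultdict(list)
    for idx, trace in enumerate(traces):
        by_variant[tuple(trace)].append(idx)

    # Shuffle the distinct variants reproducibly.
    variants = sorted(by_variant)
    rng = random.Random(seed)
    rng.shuffle(variants)

    # Hold out a fraction of the variants (at least one) for validation.
    n_val = max(1, round(val_fraction * len(variants)))
    val_variants = set(variants[:n_val])

    train, val = [], []
    for variant, idxs in by_variant.items():
        target = val if variant in val_variants else train
        target.extend(traces[i] for i in idxs)
    return train, val


if __name__ == "__main__":
    log = [
        ["a", "b", "c"],
        ["a", "b", "c"],
        ["a", "c", "b"],
        ["a", "b", "d"],
        ["a", "d"],
        ["a", "d"],
    ]
    train, val = variant_holdout_split(log, val_fraction=0.25, seed=1)
    print(len(train), "training traces,", len(val), "validation traces")
```

Because validation traces never share a variant with training traces, a validation loss computed on such a split reflects performance on unseen control-flow behavior, which is the property the abstract ties to hyperparameter selection and early stopping.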
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-01HJB0PTPKBCVMH8VV8HYXMY9R
- MLA
- Peeperkorn, Jari, et al. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS, vol. 121, 2024, doi:10.1016/j.is.2023.102330.
- APA
- Peeperkorn, J., vanden Broucke, S., & De Weerdt, J. (2024). Validation set sampling strategies for predictive process monitoring. INFORMATION SYSTEMS, 121. https://doi.org/10.1016/j.is.2023.102330
- Chicago author-date
- Peeperkorn, Jari, Seppe vanden Broucke, and Jochen De Weerdt. 2024. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS 121. https://doi.org/10.1016/j.is.2023.102330.
- Chicago author-date (all authors)
- Peeperkorn, Jari, Seppe vanden Broucke, and Jochen De Weerdt. 2024. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS 121. doi:10.1016/j.is.2023.102330.
- Vancouver
- 1. Peeperkorn J, vanden Broucke S, De Weerdt J. Validation set sampling strategies for predictive process monitoring. INFORMATION SYSTEMS. 2024;121.
- IEEE
- [1] J. Peeperkorn, S. vanden Broucke, and J. De Weerdt, “Validation set sampling strategies for predictive process monitoring,” INFORMATION SYSTEMS, vol. 121, 2024.
@article{01HJB0PTPKBCVMH8VV8HYXMY9R,
abstract = {{Previous studies investigating the efficacy of long short-term memory (LSTM) recurrent neural networks in predictive process monitoring and their ability to capture the underlying process structure have raised concerns about their limited ability to generalize to unseen behavior. Event logs often fail to capture the full spectrum of behavior permitted by the underlying processes. To overcome these challenges, this study introduces innovative validation set sampling strategies based on control-flow variant-based resampling. These strategies have undergone extensive evaluation to assess their impact on hyperparameter selection and early stopping, resulting in notable enhancements to the generalization capabilities of trained LSTM models. In addition, this study expands the experimental framework to enable accurate interpretation of underlying process models and provide valuable insights. By conducting experiments with event logs representing process models of varying complexities, this research elucidates the effectiveness of the proposed validation strategies. Furthermore, the extended framework facilitates investigations into the influence of event log completeness on the learning quality of predictive process models. The novel validation set sampling strategies proposed in this study facilitate the development of more effective and reliable predictive process models, ultimately bolstering generalization capabilities and improving the understanding of underlying process dynamics.}},
articleno = {{102330}},
author = {{Peeperkorn, Jari and vanden Broucke, Seppe and De Weerdt, Jochen}},
issn = {{0306-4379}},
journal = {{INFORMATION SYSTEMS}},
keywords = {{Hardware and Architecture,Information Systems,Software,Process mining,Predictive process monitoring,LSTM,Generalization,Validation set,Log completeness}},
language = {{eng}},
pages = {{23}},
title = {{Validation set sampling strategies for predictive process monitoring}},
url = {{http://doi.org/10.1016/j.is.2023.102330}},
volume = {{121}},
year = {{2024}},
}