Validation set sampling strategies for predictive process monitoring

Author
Jari Peeperkorn, Seppe vanden Broucke and Jochen De Weerdt
Organization
Abstract
Previous studies investigating the efficacy of long short-term memory (LSTM) recurrent neural networks in predictive process monitoring and their ability to capture the underlying process structure have raised concerns about their limited ability to generalize to unseen behavior. Event logs often fail to capture the full spectrum of behavior permitted by the underlying processes. To overcome these challenges, this study introduces innovative validation set sampling strategies based on control-flow variant-based resampling. These strategies have undergone extensive evaluation to assess their impact on hyperparameter selection and early stopping, resulting in notable enhancements to the generalization capabilities of trained LSTM models. In addition, this study expands the experimental framework to enable accurate interpretation of underlying process models and provide valuable insights. By conducting experiments with event logs representing process models of varying complexities, this research elucidates the effectiveness of the proposed validation strategies. Furthermore, the extended framework facilitates investigations into the influence of event log completeness on the learning quality of predictive process models. The novel validation set sampling strategies proposed in this study facilitate the development of more effective and reliable predictive process models, ultimately bolstering generalization capabilities and improving the understanding of underlying process dynamics.
Keywords
Hardware and Architecture, Information Systems, Software, Process mining, Predictive process monitoring, LSTM, Generalization, Validation set, Log completeness

Downloads

  • (...).pdf: full text (Published version) | UGent only | PDF | 3.35 MB

Citation

Please use this url to cite or link to this publication:

MLA
Peeperkorn, Jari, et al. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS, vol. 121, 2024, doi:10.1016/j.is.2023.102330.
APA
Peeperkorn, J., vanden Broucke, S., & De Weerdt, J. (2024). Validation set sampling strategies for predictive process monitoring. INFORMATION SYSTEMS, 121. https://doi.org/10.1016/j.is.2023.102330
Chicago author-date
Peeperkorn, Jari, Seppe vanden Broucke, and Jochen De Weerdt. 2024. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS 121. https://doi.org/10.1016/j.is.2023.102330.
Chicago author-date (all authors)
Peeperkorn, Jari, Seppe vanden Broucke, and Jochen De Weerdt. 2024. “Validation Set Sampling Strategies for Predictive Process Monitoring.” INFORMATION SYSTEMS 121. doi:10.1016/j.is.2023.102330.
Vancouver
1. Peeperkorn J, vanden Broucke S, De Weerdt J. Validation set sampling strategies for predictive process monitoring. INFORMATION SYSTEMS. 2024;121.
IEEE
[1] J. Peeperkorn, S. vanden Broucke, and J. De Weerdt, “Validation set sampling strategies for predictive process monitoring,” INFORMATION SYSTEMS, vol. 121, 2024.
BibTeX
@article{01HJB0PTPKBCVMH8VV8HYXMY9R,
  abstract     = {{Previous studies investigating the efficacy of long short-term memory (LSTM) recurrent neural networks in predictive process monitoring and their ability to capture the underlying process structure have raised concerns about their limited ability to generalize to unseen behavior. Event logs often fail to capture the full spectrum of behavior permitted by the underlying processes. To overcome these challenges, this study introduces innovative validation set sampling strategies based on control-flow variant-based resampling. These strategies have undergone extensive evaluation to assess their impact on hyperparameter selection and early stopping, resulting in notable enhancements to the generalization capabilities of trained LSTM models. In addition, this study expands the experimental framework to enable accurate interpretation of underlying process models and provide valuable insights. By conducting experiments with event logs representing process models of varying complexities, this research elucidates the effectiveness of the proposed validation strategies. Furthermore, the extended framework facilitates investigations into the influence of event log completeness on the learning quality of predictive process models. The novel validation set sampling strategies proposed in this study facilitate the development of more effective and reliable predictive process models, ultimately bolstering generalization capabilities and improving the understanding of underlying process dynamics.}},
  articleno    = {{102330}},
  author       = {{Peeperkorn, Jari and vanden Broucke, Seppe and De Weerdt, Jochen}},
  issn         = {{0306-4379}},
  journal      = {{INFORMATION SYSTEMS}},
  keywords     = {{Hardware and Architecture,Information Systems,Software,Process mining,Predictive process monitoring,LSTM,Generalization,Validation set,Log completeness}},
  language     = {{eng}},
  pages        = {{23}},
  title        = {{Validation set sampling strategies for predictive process monitoring}},
  url          = {{http://doi.org/10.1016/j.is.2023.102330}},
  volume       = {{121}},
  year         = {{2024}},
}
