A calibration test for evaluating set-based epistemic uncertainty representations
- Author
- Mira Kristin Jürgens (UGent), Thomas Mortier (UGent), Eyke Hüllermeier, Viktor Bengs and Willem Waegeman (UGent)
- Abstract
- The accurate representation of epistemic uncertainty is a challenging yet essential task in machine learning. A widely used representation corresponds to convex sets of probabilistic predictors, also known as credal sets. One popular way of constructing these credal sets is via ensembling or specialized supervised learning methods, where the epistemic uncertainty can be quantified through measures such as the set size or the disagreement among members. In principle, these sets should contain the true data-generating distribution. As a necessary condition for this validity, we adopt the strongest notion of calibration as a proxy. Concretely, we propose a novel statistical test to determine whether there is a convex combination of the set’s predictions that is calibrated in distribution. In contrast to previous methods, our framework allows the convex combination to be instance-dependent, recognizing that different ensemble members may be better calibrated in different regions of the input space. Moreover, we learn this combination via proper scoring rules, which inherently optimize for calibration. Building on differentiable, kernel-based estimators of calibration errors, we introduce a nonparametric testing procedure and demonstrate the benefits of capturing instance-level variability on synthetic and real-world experiments.
- Keywords
- Uncertainty estimation, Calibration, Ensembles, Credal sets, Epistemic uncertainty, Reliability
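The abstract names two building blocks that are easy to illustrate in code: an instance-dependent convex combination of ensemble members' predictions, and a kernel-based estimator of calibration error on which the nonparametric test is built. The sketch below is a minimal illustration under assumed choices (a Gaussian kernel on probability vectors with an identity matrix-valued kernel, random toy weights); it is not the authors' implementation, and all function names and parameter values are hypothetical.

```python
# Illustrative sketch only; kernel choice, bandwidth, and names are
# assumptions, not taken from the paper.
import numpy as np

def skce_unbiased(probs, labels, gamma=1.0):
    """Unbiased SKCE-style calibration error estimate with kernel k(p, q) I.

    probs:  (n, K) predicted class probabilities
    labels: (n,) integer class labels
    gamma:  Gaussian kernel bandwidth (assumed choice)
    """
    n, K = probs.shape
    onehot = np.eye(K)[labels]             # (n, K) one-hot encoded labels
    resid = onehot - probs                 # calibration residuals e_y - p
    # Gaussian kernel on the probability vectors
    sq = ((probs[:, None, :] - probs[None, :, :]) ** 2).sum(-1)
    k = np.exp(-gamma * sq)                # (n, n) kernel matrix
    # h_ij = k(p_i, p_j) * <e_{y_i} - p_i, e_{y_j} - p_j>
    h = k * (resid @ resid.T)
    off_diag = h.sum() - np.trace(h)       # drop i == j terms for unbiasedness
    return off_diag / (n * (n - 1))

def convex_combination(member_probs, weights):
    """Instance-dependent mixture of ensemble member predictions.

    member_probs: (n, M, K) predictions of M ensemble members
    weights:      (n, M) per-instance simplex weights (rows sum to 1)
    """
    return np.einsum("nmk,nm->nk", member_probs, weights)

# Toy usage: two members, random instance-dependent weights.
rng = np.random.default_rng(0)
n, M, K = 200, 2, 3
member_probs = rng.dirichlet(np.ones(K), size=(n, M))
weights = rng.dirichlet(np.ones(M), size=n)
mixed = convex_combination(member_probs, weights)
labels = rng.integers(0, K, size=n)
print(skce_unbiased(mixed, labels))
```

In a testing procedure of this kind, the weights would be learned (e.g., via proper scoring rules, as the abstract states) and the statistic's null distribution would typically be approximated by resampling; the toy call above only computes a point estimate for a random mixture.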
Downloads
- publisher version.pdf: full text (Published version) | open access | 5.64 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-01K24B5H77EW2RAYVM61NHN0PV
- MLA
- Jürgens, Mira Kristin, et al. “A Calibration Test for Evaluating Set-Based Epistemic Uncertainty Representations.” MACHINE LEARNING, vol. 114, no. 9, 2025, doi:10.1007/s10994-025-06844-8.
- APA
- Jürgens, M. K., Mortier, T., Hüllermeier, E., Bengs, V., & Waegeman, W. (2025). A calibration test for evaluating set-based epistemic uncertainty representations. MACHINE LEARNING, 114(9). https://doi.org/10.1007/s10994-025-06844-8
- Chicago author-date
- Jürgens, Mira Kristin, Thomas Mortier, Eyke Hüllermeier, Viktor Bengs, and Willem Waegeman. 2025. “A Calibration Test for Evaluating Set-Based Epistemic Uncertainty Representations.” MACHINE LEARNING 114 (9). https://doi.org/10.1007/s10994-025-06844-8.
- Chicago author-date (all authors)
- Jürgens, Mira Kristin, Thomas Mortier, Eyke Hüllermeier, Viktor Bengs, and Willem Waegeman. 2025. “A Calibration Test for Evaluating Set-Based Epistemic Uncertainty Representations.” MACHINE LEARNING 114 (9). doi:10.1007/s10994-025-06844-8.
- Vancouver
- 1. Jürgens MK, Mortier T, Hüllermeier E, Bengs V, Waegeman W. A calibration test for evaluating set-based epistemic uncertainty representations. MACHINE LEARNING. 2025;114(9).
- IEEE
- [1] M. K. Jürgens, T. Mortier, E. Hüllermeier, V. Bengs, and W. Waegeman, “A calibration test for evaluating set-based epistemic uncertainty representations,” MACHINE LEARNING, vol. 114, no. 9, 2025.
@article{01K24B5H77EW2RAYVM61NHN0PV,
abstract = {{The accurate representation of epistemic uncertainty is a challenging yet essential task in machine learning. A widely used representation corresponds to convex sets of probabilistic predictors, also known as credal sets. One popular way of constructing these credal sets is via ensembling or specialized supervised learning methods, where the epistemic uncertainty can be quantified through measures such as the set size or the disagreement among members. In principle, these sets should contain the true data-generating distribution. As a necessary condition for this validity, we adopt the strongest notion of calibration as a proxy. Concretely, we propose a novel statistical test to determine whether there is a convex combination of the set’s predictions that is calibrated in distribution. In contrast to previous methods, our framework allows the convex combination to be instance-dependent, recognizing that different ensemble members may be better calibrated in different regions of the input space. Moreover, we learn this combination via proper scoring rules, which inherently optimize for calibration. Building on differentiable, kernel-based estimators of calibration errors, we introduce a nonparametric testing procedure and demonstrate the benefits of capturing instance-level variability on synthetic and real-world experiments.}},
articleno = {{202}},
author = {{Jürgens, Mira Kristin and Mortier, Thomas and Hüllermeier, Eyke and Bengs, Viktor and Waegeman, Willem}},
issn = {{0885-6125}},
journal = {{MACHINE LEARNING}},
  keywords     = {{Uncertainty estimation,Calibration,Ensembles,Credal sets,Epistemic uncertainty,Reliability}},
language = {{eng}},
number = {{9}},
pages = {{29}},
title = {{A calibration test for evaluating set-based epistemic uncertainty representations}},
  url          = {{https://doi.org/10.1007/s10994-025-06844-8}},
volume = {{114}},
year = {{2025}},
}