High-dimensional prediction of binary outcomes in the presence of between-study heterogeneity

Chamberlain Mbah (UGent), Jan De Neve (UGent) and Olivier Thas (UGent)
Abstract
Many prediction methods have been proposed in the literature, but most of them ignore heterogeneity between populations. Either only data from a single study or population is available for model building and evaluation, or when data from multiple studies make up the training dataset, studies are pooled before model building. As a result, prediction models might perform less than expected when applied to new subjects from new study populations. We propose a linear method for building prediction models with high-dimensional data from multiple studies. Our method explicitly addresses between-population variability and tends to select predictors that are predictive in most of the study populations. We employ empirical Bayes estimators and hence avoid selection bias during the variable selection process. Simulation results demonstrate that the new method works better than other linear prediction methods that ignore the between-study variability. Our method is developed for classification into two groups.
Keywords
Empirical Bayes, heterogeneity, high-dimensional data, multiple studies, naive Bayes

Downloads

  • (...).pdf
    • full text (Published version) | UGent only | PDF | 513.47 KB

Citation

MLA
Mbah, Chamberlain, et al. “High-Dimensional Prediction of Binary Outcomes in the Presence of between-Study Heterogeneity.” STATISTICAL METHODS IN MEDICAL RESEARCH, vol. 28, no. 9, 2019, pp. 2848–67.
APA
Mbah, C., De Neve, J., & Thas, O. (2019). High-dimensional prediction of binary outcomes in the presence of between-study heterogeneity. STATISTICAL METHODS IN MEDICAL RESEARCH, 28(9), 2848–2867.
Chicago author-date
Mbah, Chamberlain, Jan De Neve, and Olivier Thas. 2019. “High-Dimensional Prediction of Binary Outcomes in the Presence of between-Study Heterogeneity.” STATISTICAL METHODS IN MEDICAL RESEARCH 28 (9): 2848–67.
Chicago author-date (all authors)
Mbah, Chamberlain, Jan De Neve, and Olivier Thas. 2019. “High-Dimensional Prediction of Binary Outcomes in the Presence of between-Study Heterogeneity.” STATISTICAL METHODS IN MEDICAL RESEARCH 28 (9): 2848–2867.
Vancouver
1.
Mbah C, De Neve J, Thas O. High-dimensional prediction of binary outcomes in the presence of between-study heterogeneity. STATISTICAL METHODS IN MEDICAL RESEARCH. 2019;28(9):2848–67.
IEEE
[1]
C. Mbah, J. De Neve, and O. Thas, “High-dimensional prediction of binary outcomes in the presence of between-study heterogeneity,” STATISTICAL METHODS IN MEDICAL RESEARCH, vol. 28, no. 9, pp. 2848–2867, 2019.
BibTeX
@article{8571290,
  abstract     = {Many prediction methods have been proposed in the literature, but most of them ignore heterogeneity between populations. Either only data from a single study or population is available for model building and evaluation, or when data from multiple studies make up the training dataset, studies are pooled before model building. As a result, prediction models might perform less than expected when applied to new subjects from new study populations. We propose a linear method for building prediction models with high-dimensional data from multiple studies. Our method explicitly addresses between-population variability and tends to select predictors that are predictive in most of the study populations. We employ empirical Bayes estimators and hence avoid selection bias during the variable selection process. Simulation results demonstrate that the new method works better than other linear prediction methods that ignore the between-study variability. Our method is developed for classification into two groups.},
  author       = {Mbah, Chamberlain and De Neve, Jan and Thas, Olivier},
  issn         = {0962-2802},
  journal      = {STATISTICAL METHODS IN MEDICAL RESEARCH},
  keywords     = {Empirical Bayes,heterogeneity,high-dimensional data,multiple studies,naive Bayes},
  language     = {eng},
  number       = {9},
  pages        = {2848--2867},
  title        = {High-dimensional prediction of binary outcomes in the presence of between-study heterogeneity},
  url          = {http://dx.doi.org/10.1177/0962280218787544},
  volume       = {28},
  year         = {2019},
}
