Data-driven methods for imputing national-level incidence in global burden of disease studies

Project
FERG
Abstract
Objective: To develop transparent and reproducible methods for imputing missing data on disease incidence at national-level for the year 2005.
Methods: We compared several models for imputing missing country-level incidence rates for two foodborne diseases – congenital toxoplasmosis and aflatoxin-related hepatocellular carcinoma. Missing values were assumed to be missing at random. Predictor variables were selected using least absolute shrinkage and selection operator regression. We compared the predictive performance of naive extrapolation approaches and Bayesian random and mixed-effects regression models. Leave-one-out cross-validation was used to evaluate model accuracy.
Findings: The predictive accuracy of the Bayesian mixed-effects models was significantly better than that of the naive extrapolation method for one of the two disease models. However, Bayesian mixed-effects models produced wider prediction intervals for both data sets.
Conclusion: Several approaches are available for imputing missing data at national level. Strengths of a hierarchical regression approach for this type of task are the ability to derive estimates from other similar countries, transparency, computational efficiency and ease of interpretation. The inclusion of informative covariates may improve model performance, but results should be appraised carefully.
Keywords
LASSO, SELECTION

Downloads

  • (...).139972: full text | UGent only | PDF | 863.00 KB

Citation

MLA
McDonald, Scott A et al. “Data-driven Methods for Imputing National-level Incidence in Global Burden of Disease Studies.” BULLETIN OF THE WORLD HEALTH ORGANIZATION 93.4 (2015): 228–236. Print.
APA
McDonald, S. A., Devleesschauwer, B., Speybroeck, N., Hens, N., Praet, N., Torgerson, P. R., Havelaar, A. H., et al. (2015). Data-driven methods for imputing national-level incidence in global burden of disease studies. BULLETIN OF THE WORLD HEALTH ORGANIZATION, 93(4), 228–236.
Chicago author-date
McDonald, Scott A, Brecht Devleesschauwer, Niko Speybroeck, Niel Hens, Nicolas Praet, Paul R Torgerson, Arie H Havelaar, et al. 2015. “Data-driven Methods for Imputing National-level Incidence in Global Burden of Disease Studies.” Bulletin of the World Health Organization 93 (4): 228–236.
Chicago author-date (all authors)
McDonald, Scott A, Brecht Devleesschauwer, Niko Speybroeck, Niel Hens, Nicolas Praet, Paul R Torgerson, Arie H Havelaar, Felicia Wu, Marlène Tremblay, Ermias W Amene, and Dörte Döpfer. 2015. “Data-driven Methods for Imputing National-level Incidence in Global Burden of Disease Studies.” Bulletin of the World Health Organization 93 (4): 228–236.
Vancouver
1. McDonald SA, Devleesschauwer B, Speybroeck N, Hens N, Praet N, Torgerson PR, et al. Data-driven methods for imputing national-level incidence in global burden of disease studies. BULLETIN OF THE WORLD HEALTH ORGANIZATION. 2015;93(4):228–36.
IEEE
[1] S. A. McDonald et al., “Data-driven methods for imputing national-level incidence in global burden of disease studies,” BULLETIN OF THE WORLD HEALTH ORGANIZATION, vol. 93, no. 4, pp. 228–236, 2015.
@article{5927697,
  abstract     = {Objective: To develop transparent and reproducible methods for imputing missing data on disease incidence at national-level for the year 2005.
Methods: We compared several models for imputing missing country-level incidence rates for two foodborne diseases – congenital toxoplasmosis and aflatoxin-related hepatocellular carcinoma. Missing values were assumed to be missing at random. Predictor variables were selected using least absolute shrinkage and selection operator regression. We compared the predictive performance of naive extrapolation approaches and Bayesian random and mixed-effects regression models. Leave-one-out cross-validation was used to evaluate model accuracy.
Findings: The predictive accuracy of the Bayesian mixed-effects models was significantly better than that of the naive extrapolation method for one of the two disease models. However, Bayesian mixed-effects models produced wider prediction intervals for both data sets.
Conclusion: Several approaches are available for imputing missing data at national level. Strengths of a hierarchical regression approach for this type of task are the ability to derive estimates from other similar countries, transparency, computational efficiency and ease of interpretation. The inclusion of informative covariates may improve model performance, but results should be appraised carefully.},
  author       = {McDonald, Scott A and Devleesschauwer, Brecht and Speybroeck, Niko and Hens, Niel and Praet, Nicolas and Torgerson, Paul R and Havelaar, Arie H and Wu, Felicia and Tremblay, Marlène and Amene, Ermias W and Döpfer, Dörte},
  issn         = {0042-9686},
  journal      = {BULLETIN OF THE WORLD HEALTH ORGANIZATION},
  keywords     = {LASSO,SELECTION},
  language     = {eng},
  number       = {4},
  pages        = {228--236},
  title        = {Data-driven methods for imputing national-level incidence in global burden of disease studies},
  url          = {http://dx.doi.org/10.2471/BLT.14.139972},
  volume       = {93},
  year         = {2015},
}