DAIRRy-BLUP: a high-performance computing approach to genomic prediction
- Author
- Arne De Coninck (UGent), Jan Fostier (UGent), Steven Maenhout (UGent) and Bernard De Baets (UGent)
- Abstract
- In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression–best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR-BLUP implementation, based on single-trait observations (y), that uses the Average Information algorithm for restricted maximum-likelihood estimation of the variance components. The goal of DAIRRy-BLUP is to enable the analysis of large-scale data sets to provide more accurate estimates of marker effects and breeding values. A distributed-memory framework is required since the dimensionality of the problem, determined by the number of SNP markers, can become too large to be analyzed by a single computing node. Initial results show that DAIRRy-BLUP enables the analysis of very large-scale data sets (up to 1,000,000 individuals and 360,000 SNPs) and indicate that increasing the number of phenotypic and genotypic records has a more significant effect on the prediction accuracy than increasing the density of SNP arrays.
- Keywords
- high-performance computing, RIDGE-REGRESSION, genomic prediction, distributed-memory architecture, simulated data, variance component estimation, SELECTION, INFORMATION, GENETICS, SIMULATION, ALGORITHM
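For orientation, the RR-BLUP model summarized in the abstract can be written in the usual mixed-model notation. This is a sketch in standard textbook symbols, not notation taken from this record: the design matrices X and Z, the fixed effects β, the marker-effect vector u, the residual e, and the ratio λ are all assumptions introduced here for illustration.

```latex
% Assumed notation (not from the record): n individuals, m SNP markers,
% y is the n-vector of single-trait observations,
% X is the fixed-effects design matrix, Z is the n x m marker genotype matrix.
\begin{align}
  y &= X\beta + Zu + e, \\
  u &\sim \mathcal{N}\!\left(0,\ \sigma_u^2 I_m\right), \qquad
  e \sim \mathcal{N}\!\left(0,\ \sigma_e^2 I_n\right), \\
  \begin{bmatrix}
    X^{\top}X & X^{\top}Z \\
    Z^{\top}X & Z^{\top}Z + \lambda I_m
  \end{bmatrix}
  \begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix}
  &=
  \begin{bmatrix} X^{\top}y \\ Z^{\top}y \end{bmatrix},
  \qquad \lambda = \frac{\sigma_e^2}{\sigma_u^2}.
\end{align}
```

The common variance \sigma_u^2 and the identity covariance encode the assumptions stated in the abstract: normally distributed, uncorrelated marker effects with equal variances. With m markers, the lower-right block of the coefficient matrix is m × m, which is consistent with the abstract's point that the number of SNP markers determines the dimensionality of the problem.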
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-5672275
- MLA
- De Coninck, Arne, et al. “DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction.” GENETICS, vol. 197, no. 3, 2014, pp. 813–22, doi:10.1534/genetics.114.163683.
- APA
- De Coninck, A., Fostier, J., Maenhout, S., & De Baets, B. (2014). DAIRRy-BLUP: a high-performance computing approach to genomic prediction. GENETICS, 197(3), 813–822. https://doi.org/10.1534/genetics.114.163683
- Chicago author-date
- De Coninck, Arne, Jan Fostier, Steven Maenhout, and Bernard De Baets. 2014. “DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction.” GENETICS 197 (3): 813–22. https://doi.org/10.1534/genetics.114.163683.
- Chicago author-date (all authors)
- De Coninck, Arne, Jan Fostier, Steven Maenhout, and Bernard De Baets. 2014. “DAIRRy-BLUP: A High-Performance Computing Approach to Genomic Prediction.” GENETICS 197 (3): 813–822. doi:10.1534/genetics.114.163683.
- Vancouver
- 1. De Coninck A, Fostier J, Maenhout S, De Baets B. DAIRRy-BLUP: a high-performance computing approach to genomic prediction. GENETICS. 2014;197(3):813–22.
- IEEE
- [1] A. De Coninck, J. Fostier, S. Maenhout, and B. De Baets, “DAIRRy-BLUP: a high-performance computing approach to genomic prediction,” GENETICS, vol. 197, no. 3, pp. 813–822, 2014.
@article{5672275,
  abstract     = {{In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression–best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR-BLUP implementation, based on single-trait observations (y), that uses the Average Information algorithm for restricted maximum-likelihood estimation of the variance components. The goal of DAIRRy-BLUP is to enable the analysis of large-scale data sets to provide more accurate estimates of marker effects and breeding values. A distributed-memory framework is required since the dimensionality of the problem, determined by the number of SNP markers, can become too large to be analyzed by a single computing node. Initial results show that DAIRRy-BLUP enables the analysis of very large-scale data sets (up to 1,000,000 individuals and 360,000 SNPs) and indicate that increasing the number of phenotypic and genotypic records has a more significant effect on the prediction accuracy than increasing the density of SNP arrays.}},
  author       = {{De Coninck, Arne and Fostier, Jan and Maenhout, Steven and De Baets, Bernard}},
  issn         = {{0016-6731}},
  journal      = {{GENETICS}},
  keywords     = {{high-performance computing,RIDGE-REGRESSION,genomic prediction,distributed-memory architecture,simulated data,variance component estimation,SELECTION,INFORMATION,GENETICS,SIMULATION,ALGORITHM}},
  language     = {{eng}},
  number       = {{3}},
  pages        = {{813--822}},
  title        = {{DAIRRy-BLUP: a high-performance computing approach to genomic prediction}},
  url          = {{http://doi.org/10.1534/genetics.114.163683}},
  volume       = {{197}},
  year         = {{2014}},
}