
A broken promise : microbiome differential abundance methods do not control the false discovery rate
- Author
- Stijn Hawinkel (UGent) , Federico Mattiello (UGent) , Luc Bijnens and Olivier Thas (UGent)
- Abstract
- High-throughput sequencing technologies allow easy characterization of the human microbiome, but the statistical methods to analyze microbiome data are still in their infancy. Differential abundance methods aim at detecting associations between the abundances of bacterial species and subject grouping factors. The results of such methods are important to identify the microbiome as a prognostic or diagnostic biomarker or to demonstrate efficacy of prodrug or antibiotic drugs. Because of a lack of benchmarking studies in the microbiome field, no consensus exists on the performance of the statistical methods. We have compared a large number of popular methods through extensive parametric and nonparametric simulation as well as real data shuffling algorithms. The results are consistent over the different approaches and all point to an alarming excess of false discoveries. This raises great doubts about the reliability of discoveries in past studies and imperils reproducibility of microbiome experiments. To further improve method benchmarking, we introduce a new simulation tool that allows to generate correlated count data following any univariate count distribution; the correlation structure may be inferred from real data. Most simulation studies discard the correlation between species, but our results indicate that this correlation can negatively affect the performance of statistical methods.
- Keywords
- microbiome, differential abundance, simulation, taxa correlation networks, false discovery rate, RNA-SEQ, GUT MICROBIOME, NONPARAMETRIC APPROACH, INTESTINAL MICROBIOTA, MICROARRAYS, EXPRESSION, DISEASE, AGE
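The abstract mentions a simulation tool that generates correlated count data with arbitrary univariate count distributions, but this record does not say how the tool works. One standard way to get that behaviour is a Gaussian copula: draw correlated normal variables, map them to uniforms with the normal CDF, then push the uniforms through the quantile function of the desired count distribution. The sketch below illustrates that general idea only. The function names, the choice of Poisson marginals, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def poisson_quantile(u, lam, max_count=10_000):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam).

    Intended for small to moderate means; max_count only guards
    against an endless loop when u is extremely close to 1.
    """
    u = min(u, 1.0 - 1e-12)
    k = 0
    pmf = math.exp(-lam)  # P(X = 0)
    cdf = pmf
    while cdf < u and k < max_count:
        k += 1
        pmf *= lam / k    # recurrence: P(X = k) = P(X = k - 1) * lam / k
        cdf += pmf
    return k


def correlated_poisson(n, lam_a, lam_b, rho, seed=0):
    """Draw n pairs of Poisson counts linked by a Gaussian copula.

    rho is the correlation of the latent normal variables, which is
    close to, but not the same as, the correlation of the counts.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Step 1: two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: the normal CDF turns them into dependent uniforms.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        # Step 3: the quantile function imposes the count marginals.
        pairs.append((poisson_quantile(u1, lam_a), poisson_quantile(u2, lam_b)))
    return pairs


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    pairs = correlated_poisson(2000, lam_a=5.0, lam_b=3.0, rho=0.8, seed=1)
    taxon_a, taxon_b = zip(*pairs)
    print("sample correlation of counts:", round(pearson(taxon_a, taxon_b), 3))
```

Swapping `poisson_quantile` for the quantile function of another count distribution changes the marginals without touching the dependence step, which is the property the abstract describes. In a benchmark, the latent correlation would be estimated from real data rather than fixed by hand.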
Downloads
- (...).pdf | full text | UGent only | 1.23 MB
- (...).pdf | supplementary material | UGent only | 8.43 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-8548584
- MLA
- Hawinkel, Stijn, et al. “A Broken Promise : Microbiome Differential Abundance Methods Do Not Control the False Discovery Rate.” BRIEFINGS IN BIOINFORMATICS, vol. 20, no. 1, 2019, pp. 210–21, doi:10.1093/bib/bbx104.
- APA
- Hawinkel, S., Mattiello, F., Bijnens, L., & Thas, O. (2019). A broken promise : microbiome differential abundance methods do not control the false discovery rate. BRIEFINGS IN BIOINFORMATICS, 20(1), 210–221. https://doi.org/10.1093/bib/bbx104
- Chicago author-date
- Hawinkel, Stijn, Federico Mattiello, Luc Bijnens, and Olivier Thas. 2019. “A Broken Promise : Microbiome Differential Abundance Methods Do Not Control the False Discovery Rate.” BRIEFINGS IN BIOINFORMATICS 20 (1): 210–21. https://doi.org/10.1093/bib/bbx104.
- Chicago author-date (all authors)
- Hawinkel, Stijn, Federico Mattiello, Luc Bijnens, and Olivier Thas. 2019. “A Broken Promise : Microbiome Differential Abundance Methods Do Not Control the False Discovery Rate.” BRIEFINGS IN BIOINFORMATICS 20 (1): 210–221. doi:10.1093/bib/bbx104.
- Vancouver
- 1. Hawinkel S, Mattiello F, Bijnens L, Thas O. A broken promise : microbiome differential abundance methods do not control the false discovery rate. BRIEFINGS IN BIOINFORMATICS. 2019;20(1):210–21.
- IEEE
- [1] S. Hawinkel, F. Mattiello, L. Bijnens, and O. Thas, “A broken promise : microbiome differential abundance methods do not control the false discovery rate,” BRIEFINGS IN BIOINFORMATICS, vol. 20, no. 1, pp. 210–221, 2019.
@article{8548584,
  abstract     = {{High-throughput sequencing technologies allow easy characterization of the human microbiome, but the statistical methods to analyze microbiome data are still in their infancy. Differential abundance methods aim at detecting associations between the abundances of bacterial species and subject grouping factors. The results of such methods are important to identify the microbiome as a prognostic or diagnostic biomarker or to demonstrate efficacy of prodrug or antibiotic drugs. Because of a lack of benchmarking studies in the microbiome field, no consensus exists on the performance of the statistical methods. We have compared a large number of popular methods through extensive parametric and nonparametric simulation as well as real data shuffling algorithms. The results are consistent over the different approaches and all point to an alarming excess of false discoveries. This raises great doubts about the reliability of discoveries in past studies and imperils reproducibility of microbiome experiments. To further improve method benchmarking, we introduce a new simulation tool that allows to generate correlated count data following any univariate count distribution; the correlation structure may be inferred from real data. Most simulation studies discard the correlation between species, but our results indicate that this correlation can negatively affect the performance of statistical methods.}},
  author       = {{Hawinkel, Stijn and Mattiello, Federico and Bijnens, Luc and Thas, Olivier}},
  issn         = {{1467-5463}},
  journal      = {{BRIEFINGS IN BIOINFORMATICS}},
  keywords     = {{microbiome,differential abundance,simulation,taxa correlation networks,false discovery rate,RNA-SEQ,GUT MICROBIOME,NONPARAMETRIC APPROACH,INTESTINAL MICROBIOTA,MICROARRAYS,EXPRESSION,DISEASE,AGE}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{210--221}},
  title        = {{A broken promise : microbiome differential abundance methods do not control the false discovery rate}},
  url          = {{http://doi.org/10.1093/bib/bbx104}},
  volume       = {{20}},
  year         = {{2019}},
}