Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins

Dimitri Boeckaerts (UGent), Michiel Stock (UGent), Bjorn Criel (UGent), Hans Gerstmans, Bernard De Baets (UGent) and Yves Briers (UGent)
Abstract
Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.
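The PR-AUC metric reported in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a one-vs-rest view of a single host class and estimates the area with average precision, a common estimator that may differ from the one used in the paper. The function name `pr_auc` and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def pr_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Estimate the area under the precision-recall curve as average precision.

    labels: 1 if the RBP belongs to the host class of interest, else 0
            (assumed one-vs-rest setup, not taken from the paper).
    scores: model confidence for that class; higher means more confident.
    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Each positive raises recall by 1/n_pos; weight that step
            # by the precision observed at this rank.
            area += (true_pos / rank) / n_pos
    return area


# Toy data: two true hosts among four candidates.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_area = pr_auc(toy_labels, toy_scores)
```

A ranking that places every positive above every negative gives an area of 1.0; the more positives are ranked below negatives, the lower the score. Because precision is computed only over predicted positives, the metric stays informative when most candidate hosts are negatives.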
Keywords
Multidisciplinary

Downloads

  • Predicting bacteriophage hosts.pdf
    • full text (Published version) | open access | PDF | 2.80 MB

Citation

MLA
Boeckaerts, Dimitri, et al. “Predicting Bacteriophage Hosts Based on Sequences of Annotated Receptor-Binding Proteins.” SCIENTIFIC REPORTS, vol. 11, 2021, doi:10.1038/s41598-021-81063-4.
APA
Boeckaerts, D., Stock, M., Criel, B., Gerstmans, H., De Baets, B., & Briers, Y. (2021). Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins. SCIENTIFIC REPORTS, 11. https://doi.org/10.1038/s41598-021-81063-4
Chicago author-date
Boeckaerts, Dimitri, Michiel Stock, Bjorn Criel, Hans Gerstmans, Bernard De Baets, and Yves Briers. 2021. “Predicting Bacteriophage Hosts Based on Sequences of Annotated Receptor-Binding Proteins.” SCIENTIFIC REPORTS 11. https://doi.org/10.1038/s41598-021-81063-4.
Chicago author-date (all authors)
Boeckaerts, Dimitri, Michiel Stock, Bjorn Criel, Hans Gerstmans, Bernard De Baets, and Yves Briers. 2021. “Predicting Bacteriophage Hosts Based on Sequences of Annotated Receptor-Binding Proteins.” SCIENTIFIC REPORTS 11. doi:10.1038/s41598-021-81063-4.
Vancouver
1. Boeckaerts D, Stock M, Criel B, Gerstmans H, De Baets B, Briers Y. Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins. SCIENTIFIC REPORTS. 2021;11.
IEEE
[1] D. Boeckaerts, M. Stock, B. Criel, H. Gerstmans, B. De Baets, and Y. Briers, “Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins,” SCIENTIFIC REPORTS, vol. 11, 2021.
BibTeX
@article{8687376,
  abstract     = {{Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.}},
  articleno    = {{1467}},
  author       = {{Boeckaerts, Dimitri and Stock, Michiel and Criel, Bjorn and Gerstmans, Hans and De Baets, Bernard and Briers, Yves}},
  issn         = {{2045-2322}},
  journal      = {{SCIENTIFIC REPORTS}},
  keywords     = {{Multidisciplinary}},
  language     = {{eng}},
  pages        = {{14}},
  title        = {{Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins}},
  doi          = {{10.1038/s41598-021-81063-4}},
  url          = {{https://doi.org/10.1038/s41598-021-81063-4}},
  volume       = {{11}},
  year         = {{2021}},
}
