A comprehensive evaluation of consensus spectrum generation methods in proteomics

(2022) JOURNAL OF PROTEOME RESEARCH. 21(6). p.1566-1574
Abstract
Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable for consensus spectrum generation, including for data sets with post-translational modifications (PTMs) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.
Keywords
ProteomeXchange, benchmark, big data, clustering, consensus spectra, mass spectrometry, pride database, spectral libraries

Downloads

  • (...).pdf: full text (Published version) | UGent only | PDF | 2.53 MB

Citation

MLA
Luo, Xiyang, et al. “A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.” JOURNAL OF PROTEOME RESEARCH, vol. 21, no. 6, American Chemical Society (ACS), 2022, pp. 1566–74, doi:10.1021/acs.jproteome.2c00069.
APA
Luo, X., Bittremieux, W., Griss, J., Deutsch, E. W., Sachsenberg, T., Levitsky, L. I., … Perez-Riverol, Y. (2022). A comprehensive evaluation of consensus spectrum generation methods in proteomics. JOURNAL OF PROTEOME RESEARCH, 21(6), 1566–1574. https://doi.org/10.1021/acs.jproteome.2c00069
Chicago author-date
Luo, Xiyang, Wout Bittremieux, Johannes Griss, Eric W. Deutsch, Timo Sachsenberg, Lev I. Levitsky, Mark V. Ivanov, et al. 2022. “A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.” JOURNAL OF PROTEOME RESEARCH 21 (6): 1566–74. https://doi.org/10.1021/acs.jproteome.2c00069.
Chicago author-date (all authors)
Luo, Xiyang, Wout Bittremieux, Johannes Griss, Eric W. Deutsch, Timo Sachsenberg, Lev I. Levitsky, Mark V. Ivanov, Julia A. Bubis, Ralf Gabriels, Henry Webel, Aniel Sanchez, Mingze Bai, Lukas Käll, and Yasset Perez-Riverol. 2022. “A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.” JOURNAL OF PROTEOME RESEARCH 21 (6): 1566–1574. doi:10.1021/acs.jproteome.2c00069.
Vancouver
1. Luo X, Bittremieux W, Griss J, Deutsch EW, Sachsenberg T, Levitsky LI, et al. A comprehensive evaluation of consensus spectrum generation methods in proteomics. JOURNAL OF PROTEOME RESEARCH. 2022;21(6):1566–74.
IEEE
[1] X. Luo et al., “A comprehensive evaluation of consensus spectrum generation methods in proteomics,” JOURNAL OF PROTEOME RESEARCH, vol. 21, no. 6, pp. 1566–1574, 2022.
BibTeX
@article{01GWKXNBJFAWFDPWQG92WXKY74,
  abstract     = {{Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.}},
  author       = {{Luo, Xiyang and Bittremieux, Wout and Griss, Johannes and Deutsch, Eric W. and Sachsenberg, Timo and Levitsky, Lev I. and Ivanov, Mark V. and Bubis, Julia A. and Gabriels, Ralf and Webel, Henry and Sanchez, Aniel and Bai, Mingze and Käll, Lukas and Perez-Riverol, Yasset}},
  issn         = {{1535-3893}},
  journal      = {{JOURNAL OF PROTEOME RESEARCH}},
  keywords     = {{ProteomeXchange,benchmark,big data,clustering,consensus spectra,mass spectrometry,pride database,spectral libraries}},
  language     = {{eng}},
  number       = {{6}},
  pages        = {{1566--1574}},
  publisher    = {{American Chemical Society (ACS)}},
  title        = {{A comprehensive evaluation of consensus spectrum generation methods in proteomics}},
  url          = {{https://doi.org/10.1021/acs.jproteome.2c00069}},
  volume       = {{21}},
  year         = {{2022}},
}
