
Summarization vs. peptide-based models in label-free quantitative proteomics : performance, pitfalls, and data analysis guidelines

Ludger Goeminne (UGent) , Andrea Argentini (UGent) , Lennart Martens (UGent) and Lieven Clement (UGent)
(2015) JOURNAL OF PROTEOME RESEARCH. 14(6). p.2457-2465
Project
Bioinformatics: from nucleotids to networks (N2N)
Abstract
Quantitative label-free mass spectrometry is increasingly used to analyze the proteomes of complex biological samples. However, the choice of appropriate data analysis methods remains a major challenge. We therefore provide a rigorous comparison between peptide-based models and peptide-summarization-based pipelines. We show that peptide-based models outperform summarization-based pipelines in terms of sensitivity, specificity, accuracy, and precision. We also demonstrate that the predefined FDR cutoffs for the detection of differentially regulated proteins can become problematic when differentially expressed (DE) proteins are highly abundant in one or more samples. Care should therefore be taken when data are interpreted from samples with spiked-in internal controls and from samples that contain a few very highly abundant proteins. We do, however, show that specific diagnostic plots can be used for assessing differentially expressed proteins and the overall quality of the obtained fold change estimates. Finally, our study also illustrates that imputation under the "missing by low abundance" assumption is beneficial for the detection of differential expression in proteins with low abundance, but it negatively affects moderately to highly abundant proteins. Hence, imputation strategies that are commonly implemented in standard proteomics software should be used with care.
Keywords
IDENTIFICATION, STANDARD, SHOTGUN PROTEOMICS, PROTEIN QUANTIFICATION, linear regression models, summarization-based models, proteomics, data analysis guidelines, MASS-SPECTROMETRY

Citation

Chicago
Goeminne, Ludger, Andrea Argentini, Lennart Martens, and Lieven Clement. 2015. “Summarization Vs. Peptide-based Models in Label-free Quantitative Proteomics : Performance, Pitfalls, and Data Analysis Guidelines.” Journal of Proteome Research 14 (6): 2457–2465.
APA
Goeminne, L., Argentini, A., Martens, L., & Clement, L. (2015). Summarization vs. peptide-based models in label-free quantitative proteomics : performance, pitfalls, and data analysis guidelines. JOURNAL OF PROTEOME RESEARCH, 14(6), 2457–2465.
Vancouver
1. Goeminne L, Argentini A, Martens L, Clement L. Summarization vs. peptide-based models in label-free quantitative proteomics : performance, pitfalls, and data analysis guidelines. JOURNAL OF PROTEOME RESEARCH. 2015;14(6):2457–65.
MLA
Goeminne, Ludger, Andrea Argentini, Lennart Martens, et al. “Summarization Vs. Peptide-based Models in Label-free Quantitative Proteomics : Performance, Pitfalls, and Data Analysis Guidelines.” JOURNAL OF PROTEOME RESEARCH 14.6 (2015): 2457–2465. Print.
@article{5928818,
  abstract     = {Quantitative label-free mass spectrometry is increasingly used to analyze the proteomes of complex biological samples. However, the choice of appropriate data analysis methods remains a major challenge. We therefore provide a rigorous comparison between peptide-based models and peptide-summarization-based pipelines. We show that peptide-based models outperform summarization-based pipelines in terms of sensitivity, specificity, accuracy, and precision. We also demonstrate that the predefined FDR cutoffs for the detection of differentially regulated proteins can become problematic when differentially expressed (DE) proteins are highly abundant in one or more samples. Care should therefore be taken when data are interpreted from samples with spiked-in internal controls and from samples that contain a few very highly abundant proteins. We do, however, show that specific diagnostic plots can be used for assessing differentially expressed proteins and the overall quality of the obtained fold change estimates. Finally, our study also illustrates that imputation under the ``missing by low abundance'' assumption is beneficial for the detection of differential expression in proteins with low abundance, but it negatively affects moderately to highly abundant proteins. Hence, imputation strategies that are commonly implemented in standard proteomics software should be used with care.},
  author       = {Goeminne, Ludger and Argentini, Andrea and Martens, Lennart and Clement, Lieven},
  issn         = {1535-3893},
  journal      = {JOURNAL OF PROTEOME RESEARCH},
  keyword      = {IDENTIFICATION,STANDARD,SHOTGUN PROTEOMICS,PROTEIN QUANTIFICATION,linear regression models,summarization-based models,proteomics,data analysis guidelines,MASS-SPECTROMETRY},
  language     = {eng},
  number       = {6},
  pages        = {2457--2465},
  title        = {Summarization vs. peptide-based models in label-free quantitative proteomics : performance, pitfalls, and data analysis guidelines},
  url          = {http://dx.doi.org/10.1021/pr501223t},
  volume       = {14},
  year         = {2015},
}
