VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering

(2015) BIOINFORMATICS. 31(1). p.94-101
Abstract
Motivation: In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations.

Results: A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is embedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%.
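The idea of quality-filtered, codon-level counting described in the abstract can be illustrated with a minimal sketch. This is not the Q-cpileup implementation: the read layout, the function names, the rule that all three bases of a codon must pass, and the fixed threshold `min_q` are assumptions made for illustration only. The abstract states that the actual threshold is chosen adaptively per sample and run, and it does not specify how. The only standard fact used is the Phred definition, P(error) = 10^(-Q/10).

```python
from collections import Counter


def phred_to_error_prob(q):
    """Standard Phred definition: P(error) = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def codon_frequencies(reads, codon_start, min_q):
    """Relative codon frequencies at reference offset `codon_start`.

    `reads` is an iterable of (start, sequence, qualities) tuples for
    ungapped alignments (a hypothetical layout, chosen for this sketch).
    A read contributes its codon only if it covers all three positions
    and every base in the codon has quality >= min_q (assumed rule).
    """
    counts = Counter()
    for start, seq, quals in reads:
        offset = codon_start - start
        if offset < 0 or offset + 3 > len(seq):
            continue  # read does not fully cover the codon
        if min(quals[offset:offset + 3]) < min_q:
            continue  # at least one low-quality base: drop this codon
        counts[seq[offset:offset + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy data: the third read carries a low-quality base inside the codon.
reads = [
    (0, "ATGAAA", [35, 35, 35, 35, 35, 35]),
    (0, "ATGAAA", [35, 35, 35, 35, 35, 35]),
    (0, "ATGAGA", [35, 35, 35, 35, 10, 35]),
    (3, "AAACCC", [30, 30, 30, 30, 30, 30]),
]
unfiltered = codon_frequencies(reads, codon_start=3, min_q=0)
filtered = codon_frequencies(reads, codon_start=3, min_q=20)
```

Because the whole codon is counted per read, substitutions that occur together on one read stay linked in the output, which is the property the abstract highlights for codon-level calling. In the toy data, the apparent AGA variant disappears once the low-quality base is filtered out.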

Citation

MLA
Verbist, Bie, et al. “VirVarSeq: A Low-Frequency Virus Variant Detection Pipeline for Illumina Sequencing Using Adaptive Base-Calling Accuracy Filtering.” BIOINFORMATICS, vol. 31, no. 1, 2015, pp. 94–101, doi:10.1093/bioinformatics/btu587.
APA
Verbist, B., Thys, K., Reumers, J., Wetzels, Y., Van der Borght, K., Talloen, W., … Thas, O. (2015). VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering. BIOINFORMATICS, 31(1), 94–101. https://doi.org/10.1093/bioinformatics/btu587
Chicago author-date
Verbist, Bie, Kim Thys, Joke Reumers, Yves Wetzels, Koen Van der Borght, Willem Talloen, Jeroen Aerssens, Lieven Clement, and Olivier Thas. 2015. “VirVarSeq: A Low-Frequency Virus Variant Detection Pipeline for Illumina Sequencing Using Adaptive Base-Calling Accuracy Filtering.” BIOINFORMATICS 31 (1): 94–101. https://doi.org/10.1093/bioinformatics/btu587.
Vancouver
1. Verbist B, Thys K, Reumers J, Wetzels Y, Van der Borght K, Talloen W, et al. VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering. BIOINFORMATICS. 2015;31(1):94–101.
IEEE
[1] B. Verbist et al., “VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering,” BIOINFORMATICS, vol. 31, no. 1, pp. 94–101, 2015.
BibTeX
@article{7180112,
  abstract     = {{Motivation: In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations. 
Results: A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is embedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%.}},
  author       = {{Verbist, Bie and Thys, Kim and Reumers, Joke and Wetzels, Yves and Van der Borght, Koen and Talloen, Willem and Aerssens, Jeroen and Clement, Lieven and Thas, Olivier}},
  issn         = {{1367-4803}},
  journal      = {{BIOINFORMATICS}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{94--101}},
  title        = {{VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering}},
  url          = {{http://doi.org/10.1093/bioinformatics/btu587}},
  volume       = {{31}},
  year         = {{2015}},
}
