VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering
- Author
- Bie Verbist (UGent), Kim Thys, Joke Reumers, Yves Wetzels, Koen Van der Borght, Willem Talloen, Jeroen Aerssens, Lieven Clement (UGent) and Olivier Thas (UGent)
- Abstract
- Motivation: In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations.
- Results: A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is imbedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%.
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-7180112
- MLA
- Verbist, Bie, et al. “VirVarSeq: A Low-Frequency Virus Variant Detection Pipeline for Illumina Sequencing Using Adaptive Base-Calling Accuracy Filtering.” BIOINFORMATICS, vol. 31, no. 1, 2015, pp. 94–101, doi:10.1093/bioinformatics/btu587.
- APA
- Verbist, B., Thys, K., Reumers, J., Wetzels, Y., Van der Borght, K., Talloen, W., … Thas, O. (2015). VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering. BIOINFORMATICS, 31(1), 94–101. https://doi.org/10.1093/bioinformatics/btu587
- Chicago author-date
- Verbist, Bie, Kim Thys, Joke Reumers, Yves Wetzels, Koen Van der Borght, Willem Talloen, Jeroen Aerssens, Lieven Clement, and Olivier Thas. 2015. “VirVarSeq: A Low-Frequency Virus Variant Detection Pipeline for Illumina Sequencing Using Adaptive Base-Calling Accuracy Filtering.” BIOINFORMATICS 31 (1): 94–101. https://doi.org/10.1093/bioinformatics/btu587.
- Chicago author-date (all authors)
- Verbist, Bie, Kim Thys, Joke Reumers, Yves Wetzels, Koen Van der Borght, Willem Talloen, Jeroen Aerssens, Lieven Clement, and Olivier Thas. 2015. “VirVarSeq: A Low-Frequency Virus Variant Detection Pipeline for Illumina Sequencing Using Adaptive Base-Calling Accuracy Filtering.” BIOINFORMATICS 31 (1): 94–101. doi:10.1093/bioinformatics/btu587.
- Vancouver
- 1. Verbist B, Thys K, Reumers J, Wetzels Y, Van der Borght K, Talloen W, et al. VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering. BIOINFORMATICS. 2015;31(1):94–101.
- IEEE
- [1] B. Verbist et al., “VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering,” BIOINFORMATICS, vol. 31, no. 1, pp. 94–101, 2015.
@article{7180112,
  abstract = {{Motivation: In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations. Results: A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is imbedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5\%.}},
  author = {Verbist, Bie and Thys, Kim and Reumers, Joke and Wetzels, Yves and Van der Borght, Koen and Talloen, Willem and Aerssens, Jeroen and Clement, Lieven and Thas, Olivier},
  issn = {{1367-4803}},
  journal = {{BIOINFORMATICS}},
  language = {{eng}},
  number = {{1}},
  pages = {{94--101}},
  title = {{VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering}},
  url = {{http://doi.org/10.1093/bioinformatics/btu587}},
  volume = {{31}},
  year = {{2015}},
}