
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

Dieter De Witte (UGent), Jan Van de Velde (UGent), Dries Decap (UGent), Michiel Van Bel (UGent), Pieter Audenaert (UGent), Piet Demeester (UGent), Bart Dhoedt (UGent), Klaas Vandepoele (UGent) and Jan Fostier (UGent)
(2015) Bioinformatics. 31(23). p.3758-3766
Project
Bioinformatics: from nucleotids to networks (N2N)
Abstract
Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected.

Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays.
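
The screening step described in the abstract (IUPAC words scored for conservation with a branch length score) can be illustrated with a small sketch. This is not the authors' implementation: it is a brute-force, alignment-free toy that scans the forward strand only, and it assumes the branch length score is the summed length of the branches connecting the species in which a word occurs, divided by the total tree length. The species labels, promoter strings, tree, and the function names `occurs`, `branch_length_score` and `conserved_words` are all hypothetical. The genome-wide confidence score and the MapReduce distribution are left out.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Hypothetical toy data: one promoter per species (forward strand only).
PROMOTERS = {
    "osa": "TTCACGTGAA",
    "bdi": "GGCACGTTCC",
    "zma": "AACATGTGTT",
    "sbi": "CCCCCCCCCC",
}

# Hypothetical toy tree, stored as child -> (parent, branch_length).
TREE = {
    "osa": ("n1", 0.3),
    "bdi": ("n1", 0.3),
    "n1": ("root", 0.1),
    "zma": ("n2", 0.2),
    "sbi": ("n2", 0.2),
    "n2": ("root", 0.2),
}


def occurs(motif, sequence):
    """Return True if the IUPAC word matches anywhere in the sequence."""
    k = len(motif)
    return any(
        all(base in IUPAC[code] for code, base in zip(motif, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    )


def leaves_below(node, tree):
    """Collect the leaf labels in the subtree rooted at node."""
    children = [c for c, (parent, _) in tree.items() if parent == node]
    if not children:
        return {node}
    return set().union(*(leaves_below(c, tree) for c in children))


def branch_length_score(hit_species, tree):
    """Fraction of total tree length spanned by the species with a hit."""
    hits = set(hit_species)
    total = sum(length for _, length in tree.values())
    connected = 0.0
    for child, (_, length) in tree.items():
        below = leaves_below(child, tree) & hits
        # A branch connects hits only if hits lie on both sides of it.
        if below and hits - below:
            connected += length
    return connected / total


def conserved_words(promoters, tree, k, threshold, alphabet="ACGTN"):
    """Enumerate all length-k words over alphabet and keep conserved ones."""
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        hits = [sp for sp, seq in promoters.items() if occurs(word, seq)]
        score = branch_length_score(hits, tree)
        if score >= threshold:
            yield word, score
```

Calling `dict(conserved_words(PROMOTERS, TREE, 3, 0.8))` returns the three-letter words whose occurrences span at least 80% of the toy tree. Enumerating over the full 15-letter IUPAC alphabet grows as 15^k, which is presumably why a distributed model such as MapReduce is attractive for realistic word lengths.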
Keywords
COMPARATIVE GENOMICS, FACTOR-BINDING SITES, SYSTEMATIC DISCOVERY, MOTIF DISCOVERY, GENES, TOOLS, ALIGNMENT, IDENTIFICATION, DNA, PLANT GENOMES

Downloads

  • De Witte et al. 2015 Bioinformatics 31 3758.pdf (full text, open access, PDF, 1.09 MB)

Citation

Chicago
De Witte, Dieter, Jan Van de Velde, Dries Decap, Michiel Van Bel, Pieter Audenaert, Piet Demeester, Bart Dhoedt, Klaas Vandepoele, and Jan Fostier. 2015. “BLSSpeller: Exhaustive Comparative Discovery of Conserved Cis-regulatory Elements.” Bioinformatics 31 (23): 3758–3766.
APA
De Witte, D., Van de Velde, J., Decap, D., Van Bel, M., Audenaert, P., Demeester, P., Dhoedt, B., et al. (2015). BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements. Bioinformatics, 31(23), 3758–3766.
Vancouver
1. De Witte D, Van de Velde J, Decap D, Van Bel M, Audenaert P, Demeester P, et al. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements. Bioinformatics. 2015;31(23):3758–66.
MLA
De Witte, Dieter, Jan Van de Velde, Dries Decap, et al. “BLSSpeller: Exhaustive Comparative Discovery of Conserved Cis-regulatory Elements.” Bioinformatics 31.23 (2015): 3758–3766. Print.
@article{7034695,
  abstract     = {Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. 
Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays.},
  author       = {De Witte, Dieter and Van de Velde, Jan and Decap, Dries and Van Bel, Michiel and Audenaert, Pieter and Demeester, Piet and Dhoedt, Bart and Vandepoele, Klaas and Fostier, Jan},
  doi          = {10.1093/bioinformatics/btv466},
  issn         = {1367-4803},
  journal      = {Bioinformatics},
  keywords     = {COMPARATIVE GENOMICS, FACTOR-BINDING SITES, SYSTEMATIC DISCOVERY, MOTIF DISCOVERY, GENES, TOOLS, ALIGNMENT, IDENTIFICATION, DNA, PLANT GENOMES},
  language     = {eng},
  number       = {23},
  pages        = {3758--3766},
  title        = {{BLSSpeller}: exhaustive comparative discovery of conserved cis-regulatory elements},
  url          = {http://dx.doi.org/10.1093/bioinformatics/btv466},
  volume       = {31},
  year         = {2015},
}
