
Explainability in transformer models for functional genomics

Jim Clauwaert (UGent) , Gerben Menschaert (UGent) and Willem Waegeman (UGent)
Abstract
The effectiveness of deep learning methods can be largely attributed to the automated extraction of relevant features from raw data. In the field of functional genomics, this generally concerns the automatic selection of relevant nucleotide motifs from DNA sequences. To benefit from automated learning methods, new strategies are required that unveil the decision-making process of trained models. In this paper, we present a new approach that has been successful in gathering insights on the transcription process in Escherichia coli. This work builds upon a transformer-based neural network framework designed for prokaryotic genome annotation purposes. We find that the majority of subunits (attention heads) of the model are specialized towards identifying transcription factors and are able to successfully characterize both their binding sites and consensus sequences, uncovering both well-known and potentially novel elements involved in the initiation of the transcription process. With the specialization of the attention heads occurring automatically, we believe transformer models to be of high interest towards the creation of explainable neural networks in this field.
Keywords
Molecular Biology, Information Systems, interpretable neural networks, transformers, functional genomics, DNA-binding sites, COLI RNA-POLYMERASE, ESCHERICHIA-COLI, PROMOTER RECOGNITION, TRANSCRIPTION FACTOR, BINDING, DNA, ALGORITHM, SUBUNIT, ELEMENT, SPACER
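The abstract's central idea, that individual attention heads can be read out and inspected for specialization on nucleotide motifs, can be illustrated with a minimal sketch. This is not the authors' annotation framework: the toy sequence, random weights, and function names below are illustrative assumptions, showing only how a single head's attention map over a one-hot-encoded DNA sequence is computed and examined.

```python
import numpy as np

def one_hot(seq):
    """One-hot encode a DNA string into a (length, 4) matrix."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, idx[base]] = 1.0
    return x

def attention_map(x, wq, wk):
    """Scaled dot-product attention weights for one head (no value projection)."""
    q, k = x @ wq, x @ wk
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
d_head = 8  # illustrative head dimension
wq = rng.normal(size=(4, d_head))
wk = rng.normal(size=(4, d_head))

# Hypothetical promoter-like fragment (TTGACA echoes the -35 consensus).
seq = "TTGACAATTAATCATCGAACTAGTTAACTAGTACGCA"
attn = attention_map(one_hot(seq), wq, wk)

# Each row is a probability distribution over input positions; in a trained
# model, a specialized head would concentrate mass on its binding-site motif.
print(attn.shape, attn[0].sum())
```

In the paper's setting the analogous step would aggregate such per-head maps across many genomic windows and test whether a head's high-attention positions align with known transcription factor binding sites; here the weights are random, so the map carries no biological signal.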

Downloads

  • bbab060.pdf — full text (Published version) | open access | PDF | 890.84 KB

Citation

Please use this URL to cite or link to this publication:

MLA
Clauwaert, Jim, et al. “Explainability in Transformer Models for Functional Genomics.” BRIEFINGS IN BIOINFORMATICS, vol. 22, no. 5, 2021, doi:10.1093/bib/bbab060.
APA
Clauwaert, J., Menschaert, G., & Waegeman, W. (2021). Explainability in transformer models for functional genomics. BRIEFINGS IN BIOINFORMATICS, 22(5). https://doi.org/10.1093/bib/bbab060
Chicago author-date
Clauwaert, Jim, Gerben Menschaert, and Willem Waegeman. 2021. “Explainability in Transformer Models for Functional Genomics.” BRIEFINGS IN BIOINFORMATICS 22 (5). https://doi.org/10.1093/bib/bbab060.
Chicago author-date (all authors)
Clauwaert, Jim, Gerben Menschaert, and Willem Waegeman. 2021. “Explainability in Transformer Models for Functional Genomics.” BRIEFINGS IN BIOINFORMATICS 22 (5). doi:10.1093/bib/bbab060.
Vancouver
1. Clauwaert J, Menschaert G, Waegeman W. Explainability in transformer models for functional genomics. BRIEFINGS IN BIOINFORMATICS. 2021;22(5).
IEEE
[1] J. Clauwaert, G. Menschaert, and W. Waegeman, “Explainability in transformer models for functional genomics,” BRIEFINGS IN BIOINFORMATICS, vol. 22, no. 5, 2021.
BibTeX
@article{8721764,
  abstract     = {{The effectiveness of deep learning methods can be largely attributed to the automated extraction of relevant features from raw data. In the field of functional genomics, this generally concerns the automatic selection of relevant nucleotide motifs from DNA sequences. To benefit from automated learning methods, new strategies are required that unveil the decision-making process of trained models. In this paper, we present a new approach that has been successful in gathering insights on the transcription process in Escherichia coli. This work builds upon a transformer-based neural network framework designed for prokaryotic genome annotation purposes. We find that the majority of subunits (attention heads) of the model are specialized towards identifying transcription factors and are able to successfully characterize both their binding sites and consensus sequences, uncovering both well-known and potentially novel elements involved in the initiation of the transcription process. With the specialization of the attention heads occurring automatically, we believe transformer models to be of high interest towards the creation of explainable neural networks in this field.}},
  articleno    = {{bbab060}},
  author       = {{Clauwaert, Jim and Menschaert, Gerben and Waegeman, Willem}},
  issn         = {{1467-5463}},
  journal      = {{BRIEFINGS IN BIOINFORMATICS}},
  keywords     = {{Molecular Biology,Information Systems,interpretable neural networks,transformers,functional genomics,DNA-binding sites,COLI RNA-POLYMERASE,ESCHERICHIA-COLI,PROMOTER RECOGNITION,TRANSCRIPTION FACTOR,BINDING,DNA,ALGORITHM,SUBUNIT,ELEMENT,SPACER}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{11}},
  title        = {{Explainability in transformer models for functional genomics}},
  url          = {{https://doi.org/10.1093/bib/bbab060}},
  volume       = {{22}},
  year         = {{2021}},
}
