The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis

Sofie Van Landeghem (UGent) , Stefanie De Bodt (UGent) , Zuzanna Drebert (UGent) , Dirk Inzé (UGent) and Yves Van de Peer (UGent)
(2013) PLANT CELL. 25(3). p.794-807
Project
Biotechnology for a sustainable economy (Bio-Economy)
Project
Bioinformatics: from nucleotids to networks (N2N)
Abstract
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies.
Keywords
COEXPRESSION, TOOL, THALIANA, VISUALIZATION, ROLES, GENES, MOLECULAR INTERACTION DATABASE, ASYMMETRIC LEAVES1, SHARED TASK, LEAF DEVELOPMENT

Downloads

  • Van Landeghem et al. 2013 Plant Cell 25 794.pdf (full text, open access, PDF, 2.04 MB)

Citation

Chicago
Van Landeghem, Sofie, Stefanie De Bodt, Zuzanna Drebert, Dirk Inzé, and Yves Van de Peer. 2013. “The Potential of Text Mining in Data Integration and Network Biology for Plant Research : a Case Study on Arabidopsis.” Plant Cell 25 (3): 794–807.
APA
Van Landeghem, S., De Bodt, S., Drebert, Z., Inzé, D., & Van de Peer, Y. (2013). The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis. PLANT CELL, 25(3), 794–807.
Vancouver
1. Van Landeghem S, De Bodt S, Drebert Z, Inzé D, Van de Peer Y. The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis. PLANT CELL. 2013;25(3):794–807.
MLA
Van Landeghem, Sofie, Stefanie De Bodt, Zuzanna Drebert, et al. “The Potential of Text Mining in Data Integration and Network Biology for Plant Research : a Case Study on Arabidopsis.” PLANT CELL 25.3 (2013): 794–807. Print.
BibTeX
@article{3233818,
  abstract     = {Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies.},
  author       = {Van Landeghem, Sofie and De Bodt, Stefanie and Drebert, Zuzanna and Inz{\'e}, Dirk and Van de Peer, Yves},
  issn         = {1040-4651},
  journal      = {PLANT CELL},
  keyword      = {COEXPRESSION,TOOL,THALIANA,VISUALIZATION,ROLES,GENES,MOLECULAR INTERACTION DATABASE,ASYMMETRIC LEAVES1,SHARED TASK,LEAF DEVELOPMENT},
  language     = {eng},
  number       = {3},
  pages        = {794--807},
  title        = {The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis},
  url          = {http://dx.doi.org/10.1105/tpc.112.108753},
  volume       = {25},
  year         = {2013},
}