
The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining
- Author
- Dora Szakonyi (UGent), Sofie Van Landeghem (UGent), Katja Baerenfaller, Lieven Baeyens (UGent), Jonas Blomme (UGent), Rubén Casanova-Sáez, Stefanie De Bodt (UGent), David Esteve-Bruna, Fabio Fiorani (UGent), Nathalie Gonzalez Sanchez (UGent), Jesper Grønlund, Richard GH Immink, Sara Jover-Gil, Asuka Kuwabara, Tamara Muñoz-Nortes, Aalt DJ van Dijk, David Wilson-Sánchez, Vicky Buchanan-Wollaston, Gerco C Angenent, Yves Van de Peer (UGent), Dirk Inzé (UGent), José Luis Micol, Wilhelm Gruissem, Sean Walsh and Pierre Hilson (UGent)
- Abstract
- The information that connects genotypes and phenotypes is essentially embedded in research articles written in natural language. To facilitate access to this knowledge, we constructed a framework for the curation of the scientific literature studying the molecular mechanisms that control leaf growth and development in Arabidopsis thaliana (Arabidopsis). Standard structured statements, called relations, were designed to capture diverse data types, including phenotypes and gene expression linked to genotype description, growth conditions, genetic and molecular interactions, and details about molecular entities. Relations were then annotated from the literature, defining the relevant terms according to standard biomedical ontologies. This curation process was supported by a dedicated graphical user interface, called Leaf Knowtator. A total of 283 primary research articles were curated by a community of annotators, yielding 9947 relations monitored for consistency and over 12,500 references to Arabidopsis genes. This information was converted into a relational database (KnownLeaf) and merged with other public Arabidopsis resources relative to transcriptional networks, protein–protein interaction, gene co-expression, and additional molecular annotations. Within KnownLeaf, leaf phenotype data can be searched together with molecular data originating either from this curation initiative or from external public resources. Finally, we built a network (LeafNet) with a portion of the KnownLeaf database content to graphically represent the leaf phenotype relations in a molecular context, offering an intuitive starting point for knowledge mining. Literature curation efforts such as ours provide high quality structured information accessible to computational analysis, and thereby to a wide range of applications. 
DATA: The presented work was performed in the framework of the AGRON-OMICS project (Arabidopsis GROwth Network integrating OMICS technologies), supported by the European Commission 6th Framework Programme (Grant number LSHG-CT-2006-037704). A data integration and data sharing portal collects all the major results from the consortium; all data presented in our paper are available there: https://agronomics.ethz.ch/.
- Keywords
- Literature curation, Arabidopsis, Leaf growth, Data integration, PROTEIN INTERACTIONS, PLANT ONTOLOGY, GENE, TOOL, TEXT, INFORMATION, DIFFERENTIATION, MAINTENANCE, EXPRESSION, GENERATION
Downloads
- Szakonyi et al. 2015 Current Plant Biology 2 1.pdf
- full text (Published version)
- open access
- 2.88 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-5930918
- MLA
- Szakonyi, Dora, et al. “The KnownLeaf Literature Curation System Captures Knowledge about Arabidopsis Leaf Growth and Development and Facilitates Integrated Data Mining.” CURRENT PLANT BIOLOGY, vol. 2, 2015, pp. 1–11, doi:10.1016/j.cpb.2014.12.002.
- APA
- Szakonyi, D., Van Landeghem, S., Baerenfaller, K., Baeyens, L., Blomme, J., Casanova-Sáez, R., … Hilson, P. (2015). The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining. CURRENT PLANT BIOLOGY, 2, 1–11. https://doi.org/10.1016/j.cpb.2014.12.002
- Chicago author-date
- Szakonyi, Dora, Sofie Van Landeghem, Katja Baerenfaller, Lieven Baeyens, Jonas Blomme, Rubén Casanova-Sáez, Stefanie De Bodt, et al. 2015. “The KnownLeaf Literature Curation System Captures Knowledge about Arabidopsis Leaf Growth and Development and Facilitates Integrated Data Mining.” CURRENT PLANT BIOLOGY 2: 1–11. https://doi.org/10.1016/j.cpb.2014.12.002.
- Chicago author-date (all authors)
- Szakonyi, Dora, Sofie Van Landeghem, Katja Baerenfaller, Lieven Baeyens, Jonas Blomme, Rubén Casanova-Sáez, Stefanie De Bodt, David Esteve-Bruna, Fabio Fiorani, Nathalie Gonzalez Sanchez, Jesper Grønlund, Richard GH Immink, Sara Jover-Gil, Asuka Kuwabara, Tamara Muñoz-Nortes, Aalt DJ van Dijk, David Wilson-Sánchez, Vicky Buchanan-Wollaston, Gerco C Angenent, Yves Van de Peer, Dirk Inzé, José Luis Micol, Wilhelm Gruissem, Sean Walsh, and Pierre Hilson. 2015. “The KnownLeaf Literature Curation System Captures Knowledge about Arabidopsis Leaf Growth and Development and Facilitates Integrated Data Mining.” CURRENT PLANT BIOLOGY 2: 1–11. doi:10.1016/j.cpb.2014.12.002.
- Vancouver
- 1. Szakonyi D, Van Landeghem S, Baerenfaller K, Baeyens L, Blomme J, Casanova-Sáez R, et al. The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining. CURRENT PLANT BIOLOGY. 2015;2:1–11.
- IEEE
- [1] D. Szakonyi et al., “The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining,” CURRENT PLANT BIOLOGY, vol. 2, pp. 1–11, 2015.
@article{5930918, abstract = {{The information that connects genotypes and phenotypes is essentially embedded in research articles written in natural language. To facilitate access to this knowledge, we constructed a framework for the curation of the scientific literature studying the molecular mechanisms that control leaf growth and development in Arabidopsis thaliana (Arabidopsis). Standard structured statements, called relations, were designed to capture diverse data types, including phenotypes and gene expression linked to genotype description, growth conditions, genetic and molecular interactions, and details about molecular entities. Relations were then annotated from the literature, defining the relevant terms according to standard biomedical ontologies. This curation process was supported by a dedicated graphical user interface, called Leaf Knowtator. A total of 283 primary research articles were curated by a community of annotators, yielding 9947 relations monitored for consistency and over 12,500 references to Arabidopsis genes. This information was converted into a relational database (KnownLeaf) and merged with other public Arabidopsis resources relative to transcriptional networks, protein–protein interaction, gene co-expression, and additional molecular annotations. Within KnownLeaf, leaf phenotype data can be searched together with molecular data originating either from this curation initiative or from external public resources. Finally, we built a network (LeafNet) with a portion of the KnownLeaf database content to graphically represent the leaf phenotype relations in a molecular context, offering an intuitive starting point for knowledge mining. Literature curation efforts such as ours provide high quality structured information accessible to computational analysis, and thereby to a wide range of applications. 
DATA: The presented work was performed in the framework of the AGRON-OMICS project (Arabidopsis GROwth Network integrating OMICS technologies), supported by the European Commission 6th Framework Programme (Grant number LSHG-CT-2006-037704). A data integration and data sharing portal collects all the major results from the consortium; all data presented in our paper are available there: https://agronomics.ethz.ch/.}},
  author = {{Szakonyi, Dora and Van Landeghem, Sofie and Baerenfaller, Katja and Baeyens, Lieven and Blomme, Jonas and Casanova-Sáez, Rubén and De Bodt, Stefanie and Esteve-Bruna, David and Fiorani, Fabio and Gonzalez Sanchez, Nathalie and Grønlund, Jesper and Immink, Richard GH and Jover-Gil, Sara and Kuwabara, Asuka and Muñoz-Nortes, Tamara and van Dijk, Aalt DJ and Wilson-Sánchez, David and Buchanan-Wollaston, Vicky and Angenent, Gerco C and Van de Peer, Yves and Inzé, Dirk and Micol, José Luis and Gruissem, Wilhelm and Walsh, Sean and Hilson, Pierre}},
  issn = {{2214-6628}},
  journal = {{CURRENT PLANT BIOLOGY}},
  keywords = {{Literature curation,Arabidopsis,Leaf growth,Data integration,PROTEIN INTERACTIONS,PLANT ONTOLOGY,GENE,TOOL,TEXT,INFORMATION,DIFFERENTIATION,MAINTENANCE,EXPRESSION,GENERATION}},
  language = {{eng}},
  pages = {{1--11}},
  title = {{The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining}},
  url = {{http://doi.org/10.1016/j.cpb.2014.12.002}},
  volume = {{2}},
  year = {{2015}},
}