Ghent University Academic Bibliography

Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences

Nathalie Pavy, Stephane Rombauts (UGent), Patrice Déhais (UGent), Catherine Mathé (UGent), Davuluri VV Ramana, Philippe Leroy and Pierre Rouzé (1999). Bioinformatics 15(11), p. 887-899.
abstract
Motivation: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes.
Results: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software.
type
journalArticle (proceedingsPaper)
publication status
published
keyword
IDENTIFICATION, DISCRIMINANT-ANALYSIS, DNA, RECOGNITION, ANNOTATION, PROGRAMS
journal title
Bioinformatics
volume
15
issue
11
pages
887 - 899
conference name
2nd Georgia Tech international conference on Bioinformatics, in Silico Biology, on Sequence, Structure and Function
conference location
Atlanta, GA, USA
conference start
1999-11-11
conference end
1999-11-14
Web of Science type
Article
Web of Science id
000085533100004
ISSN
1367-4803
DOI
10.1093/bioinformatics/15.11.887
language
English
UGent publication?
yes
classification
A1
copyright statement
I have transferred the copyright for this publication to the publisher
id
170695
handle
http://hdl.handle.net/1854/LU-170695
date created
2004-01-14 13:40:00
date last changed
2016-12-19 15:37:57
@article{170695,
  abstract     = {Motivation: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes.
Results: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software.},
  author       = {Pavy, Nathalie and Rombauts, Stephane and D{\'e}hais, Patrice and Math{\'e}, Catherine and Ramana, Davuluri VV and Leroy, Philippe and Rouz{\'e}, Pierre},
  issn         = {1367-4803},
  journal      = {Bioinformatics},
  keyword      = {IDENTIFICATION,DISCRIMINANT-ANALYSIS,DNA,RECOGNITION,ANNOTATION,PROGRAMS},
  language     = {eng},
  location     = {Atlanta, GA, USA},
  number       = {11},
  pages        = {887--899},
  title        = {Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences},
  url          = {http://dx.doi.org/10.1093/bioinformatics/15.11.887},
  volume       = {15},
  year         = {1999},
}

Chicago
Pavy, Nathalie, Stephane Rombauts, Patrice Déhais, Catherine Mathé, Davuluri VV Ramana, Philippe Leroy, and Pierre Rouzé. 1999. “Evaluation of Gene Prediction Software Using a Genomic Data Set: Application to Arabidopsis thaliana Sequences.” Bioinformatics 15 (11): 887–899.
APA
Pavy, N., Rombauts, S., Déhais, P., Mathé, C., Ramana, D. V., Leroy, P., & Rouzé, P. (1999). Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences. Bioinformatics, 15(11), 887–899. Presented at the 2nd Georgia Tech international conference on Bioinformatics, in Silico Biology, on Sequence, Structure and Function.
Vancouver
1. Pavy N, Rombauts S, Déhais P, Mathé C, Ramana DV, Leroy P, et al. Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences. Bioinformatics. 1999;15(11):887–99.
MLA
Pavy, Nathalie, Stephane Rombauts, Patrice Déhais, et al. “Evaluation of Gene Prediction Software Using a Genomic Data Set: Application to Arabidopsis thaliana Sequences.” Bioinformatics 15.11 (1999): 887–899. Print.