Ab initio gene identification: Prokaryote genome annotation with GeneScan and GLIMMER

Journal of Biosciences

Abstract

We compare the annotation of three complete genomes using the ab initio gene-identification methods GeneScan and GLIMMER. The annotation given in GenBank, the standard against which these are compared, was made using GeneMark. We find a number of novel genes that are predicted by both methods used here, as well as a number of genes that are predicted by GeneMark but are not identified by either of the non-consensus methods that we have used. The three organisms studied here are all prokaryotic species with fairly compact genomes. The Fourier measure forms the basis for an efficient non-consensus method for gene prediction, and the algorithm GeneScan exploits this measure. We have benchmarked this program as well as GLIMMER using the three complete prokaryotic genomes. An effort has also been made to study the limitations of these techniques for complete genome analysis. GeneScan and GLIMMER are of comparable accuracy insofar as gene identification is concerned, with sensitivities and specificities typically greater than 0.9. The number of false predictions (both positive and negative) is higher for GeneScan than for GLIMMER, but in a significant number of cases the two techniques give similar results. This suggests that there could be some as-yet unidentified additional genes in these three genomes, and also that some of the putative identifications made hitherto might require re-evaluation. All these cases are discussed in detail.
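
As a rough illustration of what a Fourier measure of coding potential can look like, the sketch below computes the spectral power of a DNA sequence at frequency 1/3 (the codon periodicity) relative to the mean power over all nonzero frequencies. This is a minimal sketch under stated assumptions, not the GeneScan implementation: the function name `period3_ratio`, the trimming to a multiple of three, and the normalisation are illustrative choices, and the abstract does not specify the thresholds or window sizes that GeneScan actually uses.

```python
import cmath

BASES = "ACGT"


def period3_ratio(seq: str) -> float:
    """Power at f = 1/3 divided by the mean power over nonzero frequencies.

    Hypothetical helper for illustration only; the normalisation is an
    assumption and may differ from the one used by GeneScan.
    """
    seq = seq.upper()
    n = len(seq) - len(seq) % 3  # trim so that f = 1/3 is an exact DFT frequency
    if n < 3:
        raise ValueError("need at least one full codon")
    seq = seq[:n]

    # At f = 1/3 the phase factor depends only on the position modulo 3.
    phase = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]

    peak = 0.0  # total power at f = 1/3, summed over the four bases
    mean = 0.0  # total power summed over frequencies k = 1 .. n-1
    for base in BASES:
        # Occurrences of this base at codon positions 0, 1 and 2.
        counts = [seq[r::3].count(base) for r in range(3)]
        total = sum(counts)
        amplitude = sum(c * p for c, p in zip(counts, phase))
        peak += abs(amplitude) ** 2
        # Parseval's theorem for a 0/1 indicator sequence: the power summed
        # over all n frequencies is n * total, and the k = 0 term is total**2.
        mean += total * (n - total)
    mean /= n - 1
    return peak / mean if mean > 0 else 0.0


if __name__ == "__main__":
    print(period3_ratio("ATG" * 50))   # strongly period-3 sequence, large ratio
    print(period3_ratio("ACGT" * 30))  # period-4 sequence, ratio close to zero
```

A sequence with a strong three-base periodicity gives a ratio well above 1, while a sequence without it gives a ratio near or below 1. A gene finder built on this idea would typically evaluate such a ratio over open reading frames or sliding windows and apply a cutoff; those details are not given in the abstract and are left out here.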

Author information

Corresponding author

Correspondence to Ramakrishna Ramaswamy.

About this article

Cite this article

Aggarwal, G., Ramaswamy, R. Ab initio gene identification: Prokaryote genome annotation with GeneScan and GLIMMER. J Biosci 27, 7–14 (2002). https://doi.org/10.1007/BF02703679

  • DOI: https://doi.org/10.1007/BF02703679
