Trends in Biotechnology
Volume 26, Issue 11, November 2008, Pages 602-611

Review
Single-molecule DNA sequencing technologies for future genomics research

https://doi.org/10.1016/j.tibtech.2008.07.003

During the current genomics revolution, the genomes of a large number of living organisms have been fully sequenced. However, with the advent of new sequencing technologies, genomics research is now at the threshold of a second revolution. Several second-generation sequencing platforms became available in 2007, but a further revolution in DNA resequencing technologies is being witnessed in 2008, with the launch of the first single-molecule DNA sequencer (Helicos Biosciences), which has already been used to resequence the genome of the M13 virus. This review discusses several single-molecule sequencing technologies that are expected to become available during the next few years and explains how they might impact on genomics research.

Introduction

Since the onset of genomics research in the mid-1990s, whole-genome sequencing has been undertaken for a large number of prokaryotes and eukaryotes. However, the initial enthusiasm and euphoria for whole-genome sequencing activity has now given way to a demand for large-scale whole-genome resequencing (see Glossary) or the sequencing of target regions and/or metagenomes and pan-genomes (see Glossary), which require an increased sequencing speed and reduced costs 1, 2. With this in mind, in 2004, the National Human Genome Research Institute of the National Institutes of Health (NIH–NHGRI) announced a total of US$70 million in grant awards for the development of DNA sequencing (see Glossary) technologies that would reduce the cost of sequencing the human genome from US$3 × 10⁹, the amount spent on the public Human Genome Project, to US$10³ by 2014 (www.genome.gov/12513210). In October 2006, the X Prize Foundation (Santa Monica, CA, USA) announced a US$10 million ‘Archon X Prize for Genomics’ to the first private effort that could sequence 100 human genomes in 10 days for less than US$10 000 per genome (http://genomics.xprize.org/genomics/archon-x-prize-for-genomics). These incentives have contributed to an explosion of research activity in the development of new DNA sequencing technologies.
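To put the targets quoted above in perspective, the following minimal sketch works out the fold reduction implied by moving from roughly US$3 billion to US$1000 per genome, and the raw daily throughput implied by the X Prize terms. The haploid genome size of about 3 × 10⁹ bases and the single-pass (1×) coverage are assumptions made for illustration, not figures from the announcements; all function and parameter names are hypothetical.

```python
# Illustrative arithmetic only; names and defaults are assumptions.
HGP_COST_USD = 3e9        # public Human Genome Project cost quoted in the text
TARGET_COST_USD = 1e3     # NIH-NHGRI target per genome for 2014
GENOME_BASES = 3e9        # ASSUMPTION: approximate haploid human genome size


def fold_reduction(old_cost_usd: float, new_cost_usd: float) -> float:
    """Return how many times cheaper the new cost is than the old cost."""
    return old_cost_usd / new_cost_usd


def required_bases_per_day(genomes: int = 100,
                           days: int = 10,
                           genome_bases: float = GENOME_BASES,
                           coverage: float = 1.0) -> float:
    """Raw bases per day needed to sequence `genomes` genomes in `days` days.

    ASSUMPTION: `coverage` is the average number of times each base is read;
    1.0 is a lower bound, and real projects read each base many times.
    """
    return genomes * genome_bases * coverage / days


if __name__ == "__main__":
    print(f"Cost reduction sought: {fold_reduction(HGP_COST_USD, TARGET_COST_USD):.0e}-fold")
    print(f"X Prize throughput at 1x: {required_bases_per_day():.0e} bases/day")
```

Raising the `coverage` argument shows how quickly the throughput requirement grows once each base has to be read several times to reach an acceptable accuracy.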

Further impetus for developing new DNA sequencing technologies came from the emerging field of personal genomics (see Glossary), which aims to study variations in the genomes of individual humans. To date, only a few personal genomes have been fully sequenced, and the genome sequences of Craig Venter and James Watson have been the only ones published 3, 4. Nevertheless, initiatives are underway to further increase the freely available sequence data for a large number of individual human genomes, as proposed in the Personal Genome Project and other similar projects launched recently (Box 1) 5, 6.

To meet the increased sequencing demands, several non-Sanger ultra-high-throughput sequencing systems became commercially available in 2007 7, 8, 9, 10, 11, 12 (for non-Sanger sequencing systems, see Glossary). These were described as ‘second generation’ or ‘next generation’ sequencing systems, and included the following: Genome Sequencer 20/FLX (commercialized by 454/Roche); ‘Solexa 1G’ (later named ‘Genome Analyzer’ and commercialized by Illumina/Solexa); SOLiD™ system (commercialized by Applied Biosystems); and Polonator G.007 (commercialized by Dover Systems). These developments have significantly reduced the cost of sequencing – from 1 cent for 10 bases to 1 cent for 1000 bases – and have simultaneously yielded an increase in DNA sequencing speed. As a result, several commercial genotyping services [e.g. Knome, DeCode, 23andMe and Navigenics (http://www.technologyreview.com/Biotech/20926/)] have also appeared on the market. These companies provide services to examine the genome of a person at as many as one million sites for US$1,000 to US$2,500. Future services might include the sequencing of whole genomes of individual humans, if there is a demand. However, there is also an attempt to regulate these gene tests, particularly in the states of New York and California, where gene test firms are being asked to obtain permits and conduct these tests only on the advice of a physician (http://tinyurl.com/55zzk8; http://tinyurl.com/5qgnr9).
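The per-base prices quoted above translate into whole-genome costs as in the short sketch below. It assumes a genome of about 3 × 10⁹ bases read once (1× coverage); both the genome size and the `coverage` parameter are illustrative assumptions, and the function names are hypothetical. Under these assumptions, a single pass over the genome falls from roughly US$3 million to roughly US$30 000, a 100-fold reduction.

```python
# Illustrative arithmetic only; names and defaults are assumptions.
GENOME_BASES = 3e9  # ASSUMPTION: approximate haploid human genome size


def cost_per_base_usd(cents: float, bases: float) -> float:
    """Convert a price of `cents` per `bases` bases into US dollars per base."""
    return cents / 100.0 / bases


def genome_cost_usd(usd_per_base: float,
                    genome_bases: float = GENOME_BASES,
                    coverage: float = 1.0) -> float:
    """Cost of reading every base of the genome `coverage` times."""
    return usd_per_base * genome_bases * coverage


# Prices quoted in the text: 1 cent per 10 bases, then 1 cent per 1000 bases.
older_rate = cost_per_base_usd(cents=1, bases=10)
newer_rate = cost_per_base_usd(cents=1, bases=1000)

if __name__ == "__main__":
    print(f"Older rate: US${genome_cost_usd(older_rate):,.0f} per 1x genome")
    print(f"Newer rate: US${genome_cost_usd(newer_rate):,.0f} per 1x genome")
    print(f"Reduction: {older_rate / newer_rate:.0f}-fold")
```

Because short-read platforms read each base many times over, the practical cost per genome is higher than the 1× figure by about the coverage factor.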

Another non-Sanger DNA sequencing approach is the use of single-molecule sequencing (SMS), which only became available in 2008. This approach has been described as a ‘third generation’ or ‘next-next generation’ sequencing technology. It is anticipated that SMS will be much faster and cheaper, so that researchers in the near future should be able to pursue new scientific enquiries that are currently not possible owing to the prohibitive cost of sequencing. Whether or not SMS will fulfill this promise is debatable; compared with the earlier methods mentioned above, a significant reduction in sequencing costs has not yet been demonstrated for the only commercially available SMS technology, which is provided by Helicos Biosciences (Table 1).

Nevertheless, additional major efforts are underway to develop novel SMS technologies, which will be discussed below. In addition, much work is being done to address the outstanding issues that have become apparent during the development of these second- and third-generation sequencing technologies. Most of these outstanding issues (e.g. short read-lengths, higher error-rates, and the difficulty of managing massive amounts of data) are actually common to both second- and third-generation sequencing technologies, and bioinformatics (see Glossary) tools are being developed to deal with them. However, this article focuses on the third-generation SMS technologies, the so-called ‘next-next generation’. These systems will probably constitute a significant fraction of future genomics research efforts, mainly because they will significantly reduce the cost and effort of sequencing in comparison with second-generation technologies – despite the fact that cost reductions continue to be made for second-generation approaches. SMS technology should not necessarily be considered as a panacea that can overcome all limitations associated with earlier technologies, and it is unlikely that it will completely replace all earlier sequencing technologies. Rather, it should be seen as another new and promising technology; its actual benefits and limitations will only become fully known after it has been used by a large number of researchers. It currently appears that, in future, SMS technologies will either be the dominant DNA sequencing methods or co-exist with other technologies.

Section snippets

‘State of the art’ of single-molecule sequencing

The amplification of the target DNA by polymerase chain reaction (PCR) is an integral step in all second-generation sequencing technologies. However, it creates several problems, including the introduction of a bias in template representation, and the introduction of errors during amplification. Another problem associated with second-generation sequencing technologies is ‘dephasing’ of the DNA strands due to loss of synchronicity in synthesis (i.e. different strands being sequenced in

Outstanding crucial issues

Despite these advantages, the SMS technologies share with other recent technologies some of the limitations and outstanding problems. These problems have been widely discussed, and they need to be addressed before any of these technologies can be extensively used in genomics research in a user-friendly and cost-effective manner.

Conclusions and perspectives

The demand for large-scale DNA sequencing has dramatically increased in recent years. As a result, we are witnessing the development of several mutually competitive DNA sequencing systems that are much faster and cheaper and which have a higher level of precision. These new systems involve ultra-high-throughput sequencing of a large number of DNA fragments in parallel and include the following three classes of sequencing systems: the first-generation systems, which are based on the old, Sanger

Acknowledgements

The Indian National Science Academy (INSA) awarded the position of INSA Honorary Scientist to P.K.G.; the Head of the Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut, India provided the facilities; Ajay Kumar helped in various ways during the preparation of this manuscript; Sachin Rustgi helped in improving the quality of the figures; and the Editor subjected the manuscript to several rounds of critical reading, which led to significant improvement of this

Glossary

Bioinformatics
the application of molecular biology as an information science, especially involving the use of computers in genomics research.
DNA resequencing
sequencing an individual's specific DNA segment, for which sequence information is already available from one or more other individuals.
DNA sequencing
this term encompasses biochemical methods for determining the order of the nucleotide bases, adenine, guanine, cytosine and thymine, in a DNA oligonucleotide.
Epigenome
a form of the genome that

References (65)

  • M. Morgante. Transposable elements and the plant pan genome. Curr. Opin. Plant Biol. (2007)
  • R. Lister. Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell (2008)
  • M. Olson. Enrichment of super-sized resequencing targets from the human genome. Nat. Methods (2007)
  • S. Levy. The diploid genome sequence of an individual human. PLoS Biol. (2007)
  • D.A. Wheeler. The complete genome of an individual by massively parallel DNA sequencing. Nature (2008)
  • J. Kaiser. A plan to capture human diversity in 1000 genomes. Science (2008)
  • G.M. Church. Genomes for all. Sci. Am. (2006)
  • J. Shendure. Advanced sequencing technologies. Nat. Rev. Genet. (2004)
  • M.L. Metzker. Emerging technologies in DNA sequencing. Genome Res. (2005)
  • P.K. Gupta. Ultrafast and low cost sequencing methods for applied genomics research. Proc. Natl. Acad. Sci. India (2008)
  • J.H. Jett. High-speed DNA sequencing: an approach based upon fluorescence detection of single molecules. J. Biomol. Struct. Dyn. (1989)
  • S. Brakmann. A further step towards single-molecule sequencing: Escherichia coli exonuclease III degrades DNA that is fluorescently labeled at each base pair. Angew. Chem. Int. Ed. Engl. (2002)
  • A. Crut. Detection of single DNA molecules by multicolor quantum-dot end-labeling. Nucleic Acids Res. (2005)
  • I. Braslavsky. Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. U. S. A. (2003)
  • T.D. Harris. Single molecule DNA sequencing of a viral genome. Science (2008)
  • M.J. Levene. Zero-mode waveguides for single molecule analysis at high concentrations. Science (2003)
  • J. Korlach. Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures. Proc. Natl. Acad. Sci. U. S. A. (2008)
  • J.J. Kasianowicz. Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. U. S. A. (1996)
  • Karow, J. (2008) Moving from simulations to real data, short read assemblers. In Sequence, 2008 March 18, Vol. 2, Iss....
  • T.J. Albert. Direct selection of human genomic loci by microarray hybridization. Nat. Methods (2007)
  • D.T. Okou. Microarray-based genomic selection for high-throughput resequencing. Nat. Methods (2007)