Trends in Biotechnology
Review
Single-molecule DNA sequencing technologies for future genomics research
Introduction
Since the onset of genomics research in the mid-1990s, whole-genome sequencing has been undertaken for a large number of prokaryotes and eukaryotes. However, the initial enthusiasm for whole-genome sequencing has now given way to a demand for large-scale whole-genome resequencing (see Glossary) or the sequencing of target regions and/or metagenomes and pan-genomes (see Glossary), which require increased sequencing speed and reduced costs [1, 2]. With this in mind, in 2004, the National Human Genome Research Institute of the National Institutes of Health (NIH–NHGRI) announced a total of US$70 million in grant awards for the development of DNA sequencing (see Glossary) technologies that would reduce the cost of sequencing the human genome from US$3 × 10⁹, the amount spent on the public Human Genome Project, to US$10³ by 2014 (www.genome.gov/12513210). In October 2006, the X Prize Foundation (Santa Monica, CA, USA) announced a US$10 million ‘Archon X Prize for Genomics’ for the first private effort that could sequence 100 human genomes in 10 days for less than US$10,000 per genome (http://genomics.xprize.org/genomics/archon-x-prize-for-genomics). These incentives have contributed to an explosion of research activity in the development of new DNA sequencing technologies.
Further impetus for developing new DNA sequencing technologies came from the emerging field of personal genomics (see Glossary), which aims to study variations in the genomes of individual humans. To date, only a few personal genomes have been fully sequenced, and the genome sequences of Craig Venter and James Watson have been the only ones published [3, 4]. Nevertheless, initiatives are underway to further increase the freely available sequence data for a large number of individual human genomes, as proposed in the Personal Genome Project and other similar projects launched recently (Box 1) [5, 6].
To meet the increased sequencing demands, several non-Sanger ultra-high-throughput sequencing systems became commercially available in 2007 [7–12] (for non-Sanger sequencing systems, see Glossary). These were described as ‘second generation’ or ‘next generation’ sequencing systems, and included the following: Genome Sequencer 20/FLX (commercialized by 454/Roche); ‘Solexa 1G’ (later named ‘Genome Analyzer’ and commercialized by Illumina/Solexa); SOLiD™ system (commercialized by Applied Biosystems); and Polonator G.007 (commercialized by Dover Systems). These developments have significantly reduced the cost of sequencing, from 1 cent for 10 bases to 1 cent for 1000 bases, and have simultaneously increased DNA sequencing speed. As a result, several commercial genotyping services [e.g. Knome, DeCode, 23andMe and Navigenics (http://www.technologyreview.com/Biotech/20926/)] have also appeared on the market. These companies examine the genome of a person at as many as one million sites for US$1,000 to US$2,500. Future services might include the sequencing of whole genomes of individual humans, if there is a demand. However, there are also attempts to regulate these gene tests, particularly in the states of New York and California, where gene test firms are being asked to obtain permits and to conduct these tests only on the advice of a physician (http://tinyurl.com/55zzk8; http://tinyurl.com/5qgnr9).
Another non-Sanger DNA sequencing approach is single-molecule sequencing (SMS), which became available only in 2008. This approach has been described as a ‘third generation’ or ‘next-next generation’ sequencing technology. It is anticipated that SMS will be much faster and cheaper, so that researchers should soon be able to pursue scientific enquiries that are currently not possible owing to the prohibitive cost of sequencing. Whether or not SMS will fulfill this promise is debatable: compared with the earlier methods mentioned above, a significant reduction in sequencing costs has not yet been demonstrated for the only commercially available SMS technology, which is provided by Helicos Biosciences (Table 1).
Nevertheless, additional major efforts are underway to develop novel SMS technologies, which will be discussed below. In addition, much work is being done to address the outstanding issues that have become apparent during the development of these second- and third-generation sequencing technologies. Most of these issues (e.g. short read-lengths, higher error-rates, and the difficulty of managing massive amounts of data) are common to both second- and third-generation sequencing technologies, and bioinformatics (see Glossary) tools are being developed to deal with them. However, this article focuses on the third-generation SMS technologies, the so-called ‘next-next generation’. These systems will probably constitute a significant fraction of future genomics research efforts, mainly because they are expected to reduce the cost and effort of sequencing significantly in comparison with second-generation technologies, even though cost reductions continue to be made for second-generation approaches. SMS technology should not be considered a panacea that can overcome all limitations associated with earlier technologies, and it is unlikely to replace all earlier sequencing technologies completely. Rather, it should be seen as another new and promising technology; its actual benefits and limitations will only become fully known after it has been used by a large number of researchers. It currently appears that, in future, SMS technologies will either be the dominant DNA sequencing methods or co-exist with other technologies.
Section snippets
‘State of the art’ of single-molecule sequencing
The amplification of the target DNA by polymerase chain reaction (PCR) is an integral step in all second-generation sequencing technologies. However, it creates several problems, including the introduction of a bias in template representation, and the introduction of errors during amplification. Another problem associated with second-generation sequencing technologies is ‘dephasing’ of the DNA strands due to loss of synchronicity in synthesis (i.e. different strands being sequenced in
Outstanding crucial issues
Despite these advantages, SMS technologies share some limitations and outstanding problems with other recent technologies. These problems have been widely discussed, and they need to be addressed before any of these technologies can be used extensively in genomics research in a user-friendly and cost-effective manner.
Conclusions and perspectives
The demand for large-scale DNA sequencing has dramatically increased in recent years. As a result, we are witnessing the development of several mutually competitive DNA sequencing systems that are much faster and cheaper and which have a higher level of precision. These new systems involve ultra-high-throughput sequencing of a large number of DNA fragments in parallel and include the following three classes of sequencing systems: the first-generation systems, which are based on the old, Sanger
Acknowledgements
The Indian National Science Academy (INSA) awarded the position of INSA Honorary Scientist to P.K.G.; the Head of the Department of Genetics and Plant Breeding, Chaudhary Charan Singh University, Meerut, India provided the facilities; Ajay Kumar helped in various ways during the preparation of this manuscript; Sachin Rustgi helped in improving the quality of the figures; and the Editor subjected the manuscript to several rounds of critical reading, which led to significant improvement of this
Glossary
- Bioinformatics: the application of molecular biology as an information science, especially involving the use of computers in genomics research.
- DNA resequencing: sequencing an individual's specific DNA segment, for which sequence information is already available from one or more other individuals.
- DNA sequencing: this term encompasses biochemical methods for determining the order of the nucleotide bases, adenine, guanine, cytosine and thymine, in a DNA oligonucleotide.
- Epigenome: a form of the genome that
References (65)
Whole genome resequencing. Curr. Opin. Genet. Dev. (2006)
Advances in sequencing technology. Mutat. Res. (2005)
The impact of next generation sequencing technology on the genetics. Trends Genet. (2008)
Rapid DNA sequencing based upon single molecule detection. Genet. Anal. Tech. Appl. (1991)
et al. Single-molecule detection as an approach to rapid DNA sequencing. Trends Biotechnol. (1992)
Progress towards single-molecule DNA sequencing: a one color demonstration. J. Biotechnol. (2003)
Sequencing single molecules of DNA. Curr. Opin. Chem. Biol. (2006)
et al. Nanopore sequencing technology: research trends and applications. Trends Biotechnol. (2006)
Towards nanoscale genome sequencing. Trends Biotechnol. (2007)
et al. Bioinformatics challenges of new sequencing technology. Trends Genet. (2008)
Transposable elements and the plant pan genome. Curr. Opin. Plant Biol.
Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell
Enrichment of super-sized resequencing targets from the human genome. Nat. Methods
The diploid genome sequence of an individual human. PLoS Biol.
The complete genome of an individual by massively parallel DNA sequencing. Nature
A plan to capture human diversity in 1000 genomes. Science
Genomes for all. Sci. Am.
Advanced sequencing technologies. Nat. Rev. Genet.
Emerging technologies in DNA sequencing. Genome Res.
Ultrafast and low cost sequencing methods for applied genomics research. Proc. Natl. Acad. Sci. India
High-speed DNA sequencing: an approach based upon fluorescence detection of single molecules. J. Biomol. Struct. Dyn.
A further step towards single-molecule sequencing: Escherichia coli exonuclease III degrades DNA that is fluorescently labeled at each base pair. Angew. Chem. Int. Ed. Engl.
Detection of single DNA molecules by multicolor quantum-dot end-labeling. Nucleic Acids Res.
Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. U. S. A.
Single molecule DNA sequencing of a viral genome. Science
Zero-mode waveguides for single molecule analysis at high concentrations. Science
Selective aluminum passivation for targeted immobilization of single DNA polymerase molecules in zero-mode waveguide nanostructures. Proc. Natl. Acad. Sci. U. S. A.
Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. U. S. A.
Direct selection of human genomic loci by microarray hybridization. Nat. Methods
Microarray-based genomic selection for high-throughput resequencing. Nat. Methods