Abstract
A considerable and unanticipated plasticity of the human genome, manifested as inter-individual copy number variation, has been discovered. These structural changes constitute a major source of inter-individual genetic variation that could explain variable penetrance of inherited (Mendelian and polygenic) diseases and variation in the phenotypic expression of aneuploidies and sporadic traits, and might represent a major factor in the aetiology of complex, multifactorial traits. For these reasons, an effort should be made to discover all common and rare copy number variants (CNVs) in the human population. This will also enable systematic exploration of both SNPs and CNVs in association studies to identify the genomic contributors to the common disorders and complex traits.
Acknowledgements
We thank all the past and present members of our laboratories and clinics for ideas, debates and discussions. We thank J. Lupski for critical reading of the manuscript. Work in J.S.B.'s laboratory is funded by grants from the SNF (Swiss National Science Foundation) and the University of Lausanne, Switzerland. The laboratory of X.E. is supported by: the Departament d'Educació i Universitats and the Departament de Salut of the Catalan Autonomous Government (Generalitat de Catalunya); the Ministry of Health and the Ministry of Education and Science of the Spanish Government; the European Union Sixth Framework Programme; and Genoma España. S.E.A.'s laboratory is supported by the SNF, European Union, US National Institutes of Health and the Lejeune and ChildCare Foundations (France).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Glossary
- Aneuploidy
Having more or less than the typical chromosome number (46 for humans).
- Array comparative genomic hybridization
A technology in which sample and reference DNA are differentially labelled and hybridized on BAC or oligonucleotide microarrays to show copy number differences between the two genomes.
- Association study
A population-based genetic study that examines whether a marker allele segregates with a phenotype (such as disease occurrence or a quantitative trait) at a significantly higher rate than would be predicted by chance alone. This is ascertained by genotyping variants in both affected and unaffected or control individuals.
- BAC
A DNA construct, derived from a fertility plasmid (or F-plasmid), which usually carries an insert of 100–300 kb. Complete genomic libraries cloned in BACs (or PACs, which are produced from P1-plasmids) have been useful in constructing arrays for array comparative genomic hybridization experiments.
- Copy number variant or polymorphism
A structural genomic variant that results in confined copy number changes in a specific chromosomal region. If its population allele frequency is less than 1%, it is referred to as a variant; if its frequency exceeds 1%, the term polymorphism is used.
- Duplicon
A duplication, or portion thereof, of genomic sequence that shows a high level of sequence identity (over 90%) to another region of a reference genome. Also sometimes referred to as a low copy repeat or segmental duplication.
- Genomic disorder
A disorder that results from the gain, loss or re-orientation of a genomic region that often contains dosage-sensitive gene(s). The underlying event is a genomic rearrangement (such as a duplication, deletion or inversion). Segmental duplications are often involved in the rearrangement event through non-allelic homologous recombination.
- Haplotype block
A chromosomal region in which groups of alleles at different genetic loci are inherited together more often than would be expected by chance. Adjacent blocks are separated by recombination hotspots (short regions with high recombination rates).
- Hardy–Weinberg equilibrium
The binomial distribution of genotypes in a population, such that frequencies of genotypes AA, Aa and aa will be p², 2pq and q², respectively, where p is the frequency of allele A, and q is the frequency of allele a.
- High-resolution tiling path CGH arrays
Arrays for comparative genomic hybridization (CGH) that offer a resolution in the order of bases to kilobases. The arrays currently use BACs or long oligonucleotides.
- Hypomorphic
Describes an allele that carries a mutation that causes a partial loss of gene function.
- Linkage disequilibrium
A measure of whether alleles at two loci coexist within gametes in a population in a nonrandom fashion. Alleles that are in linkage disequilibrium are found together on the same haplotype more often than would be expected by chance.
- Microsatellite
A class of repetitive DNA sequences, scattered throughout the genome, that are made up of tandemly organized repeats of 2–8 nucleotides in length. They can be highly polymorphic and are frequently used as molecular markers in population genetics studies.
- Minisatellites
Regions of DNA in which repeat units of 7–100 bp are arranged in tandem arrays 0.5–30 kb long.
- Multiplex amplification and probe hybridization
A technique in which, following hybridization to immobilized samples of nucleic acid sequences, amplification of each oligonucleotide probe yields a product of unique size. The copy number of target sequences is reflected in the relative intensities of the amplification products.
- Multiplex ligation-dependent probe amplification
A technique involving the ligation of two oligonucleotides that anneal adjacent to each other, followed by quantitative PCR amplification of the ligated products. It allows the detection of deletions, duplications and trisomies, the characterization of chromosomal aberrations in copy number or sequence, and the detection of SNPs or mutations.
- Non-allelic homologous recombination
Recombination between non-allelic paralogous segmental duplications (also known as low copy repeats); a major mechanism leading to deletions, duplications and inversions, as well as complex structural polymorphism and rearrangements in the human genome.
- Paralogous sequence variants
Genetic changes that are not due to polymorphism but to nucleotide mismatches from paralogous copies of duplicated sequences of the genome. About 20% of the SNPs deposited in databases are not true SNPs but paralogous sequence variants.
- Penetrance
The extent to which a given genotype manifests itself in a given phenotype. The penetrance of some genotypes for some diseases is age-related, complicating the determination of true penetrance.
- Quantitative multiplex PCR of short fluorescent fragments
Semi-quantitative, high-throughput analysis of targeted genomic alterations using locus-specific primers.
- Restriction fragment length polymorphism
A fragment length variant of a DNA sequence that is generated through the gain or loss of a site for a restriction enzyme.
- Tagging SNPs
SNPs that are correlated with and therefore can serve as proxies for a set of variants with which they are in linkage disequilibrium.
- Ultra high-throughput sequencing
A compendium of new sequencing technologies with a common aim to accelerate (from years to days or hours) and reduce the cost (from millions to thousands or hundreds of dollars) of sequencing.
- Uniparental disomy
A state wherein both homologues (alleles) at a locus derive from the same parent. Uniparental disomy of some chromosomal segments generates characteristic syndromes.
- Variable number of tandem repeat locus
A locus that contains a variable number of short tandemly repeated DNA sequences that vary in length and are highly polymorphic.
- Whole-genome tiling array
A high-density oligonucleotide array that represents the majority of DNA sequences of an organism's genome.
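Several of the glossary entries are quantitative and can be made concrete with a short calculation. The Python sketch below is illustrative only and is not taken from the article: all function names are hypothetical, the treatment of a frequency of exactly 1% as a polymorphism is an assumption (the glossary leaves that boundary open), and the log2 ratio is the conventional idealized read-out of array comparative genomic hybridization, assuming signal intensity is proportional to copy number.

```python
from math import log2


def classify_by_frequency(allele_freq, threshold=0.01):
    """Label a copy number change as a rare 'variant' or a 'polymorphism'.

    Uses the glossary's 1% population allele frequency cut-off. A frequency
    of exactly 1% is called a polymorphism here (an assumption).
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return "variant" if allele_freq < threshold else "polymorphism"


def hardy_weinberg(p):
    """Expected genotype frequencies p^2, 2pq and q^2 for allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r^2) for two biallelic loci.

    p_ab is the frequency of haplotypes carrying both allele A and allele B;
    p_a and p_b are the frequencies of those alleles at their own loci.
    D = p_AB - p_A * p_B measures departure from random co-occurrence.
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


def acgh_log2_ratio(sample_copies, reference_copies=2):
    """Idealized aCGH signal: log2 of sample over reference copy number."""
    if sample_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive for a log ratio")
    return log2(sample_copies / reference_copies)


if __name__ == "__main__":
    print(classify_by_frequency(0.005))           # rare change
    print(hardy_weinberg(0.7))                    # expected AA, Aa, aa
    print(linkage_disequilibrium(0.5, 0.5, 0.5))  # alleles always co-occur
    print(acgh_log2_ratio(3))                     # single-copy gain
```

Under these assumptions, a heterozygous deletion (one copy against a two-copy reference) gives a log2 ratio of -1, and two loci whose alleles always travel on the same haplotype give r² = 1, which is the situation in which one variant can act as a tagging SNP for the other.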
Cite this article
Beckmann, J., Estivill, X. & Antonarakis, S. Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nat Rev Genet 8, 639–646 (2007). https://doi.org/10.1038/nrg2149