Trends in Genetics
Plant conserved non-coding sequences and paralogue evolution
Introduction
Gene duplication is a major determinant of the size and gene complement of eukaryotic genomes. Perhaps the most spectacular method of gene duplication is whole-genome duplication via polyploidization, which has a major role in the evolutionary history of eukaryotes. For example, the yeast Saccharomyces cerevisiae contains numerous duplicated genes and chromosomal regions that are attributed to a polyploidy event ∼100 million years ago (mya) [1, 2]. Similarly, the human genome retains vestiges of duplication events that occurred between 350 and 650 mya and that might be attributable to at least one polyploid event [3].
Genome duplication is particularly prominent in plants. Arabidopsis thaliana has experienced at least three ancient polyploid events [4, 5, 6]; rice (Oryza sativa) contains duplicated chromosomal regions that are attributable either to ancient segmental duplications [7] or to a paleopolyploid event [8, 9]; and genetic maps of maize (Zea mays ssp. mays) contain evidence of several large-scale duplication events [10]. More recently, Blanc and Wolfe [11] used EST data to investigate the number and relative age of duplicated genes in 14 plant species. Nine of these species contained evidence of ancient large-scale duplication events, reflecting at least seven paleopolyploid events in the phylogenetic history of the sample. At least 16 polyploid events have been documented during the evolutionary history of a relatively small sample of angiosperm taxa (Figure 1).
The inescapable conclusion is that the organization and evolution of plant genomes have been shaped by many recent and ancient polyploid events. By contrast, vertebrates have probably undergone only one or two large-scale genome duplication events throughout their ∼500 million year history [3, 12, 13]. With a few exceptions (such as amphibians [14]), extant polyploids are also rare. In mammals, for example, the only known polyploid is the tetraploid red viscacha rat of Argentina [15].
The relative frequency of genome duplication in plant and animal lineages affects their genome organization but might also have profound effects on gene function and regulation. Gene duplication has long been considered a crucial step in ‘freeing’ single-copy genes from selective constraint, enabling them to evolve new functions [12], but a pair of duplicated genes can also diverge in function as a result of changes in regulatory elements [16]. These observations raise a number of questions: What are the consequences of genome duplication for the regulatory complexity of plant genomes? Are differences in regulatory motifs – more specifically conserved non-coding sequences (CNSs) – evident between plant and animal genomes? How does genetic duplication affect regulatory elements, and what are the consequences for the evolution of promoter regions between paralogous genes? We address these questions in this article.
The complexity of plant and animal CNSs
CNSs are short stretches of non-coding DNA that have been preserved between species. Such conservation might be indicative of selective constraint and hence function. CNSs are found predominantly in upstream regulatory regions and are enriched for sequences that perform regulatory functions [17, 18]. Indeed, CNSs have a functional role in gene expression [19], and it is thought that they are assembly points for large, multi-protein complexes that perform gene-regulatory functions [18]. In the
CNSs in plant paralogues
In addition to organismal complexity, gene duplication can also contribute to the substantial differences between CNSs in cereal and mammalian genomes. Gene duplication is an important driving force for generating evolutionary novelty. Ohno's classical model of gene duplication [12] states that a gene under tight functional constraint is ‘freed’ from selection pressures once a duplication event creates a redundant copy. This liberated gene copy has two potential fates: it can acquire a new
Fractionation: functional biases and the evolution of gene expression
Given the prevalence of genome duplication in plants, fractionation has a crucial role in shaping the functional complement of plant genomes. However, surprisingly little is known about the patterns and processes of fractionation. For example, some studies indicate that genome rearrangement occurs rapidly after polyploidization, suggesting that gene loss is initially rapid but eventually slows [35, 36]. Nonetheless, we do not know how widely this applies or the long-term rates of gene loss in
Future directions
It is not an exaggeration to say that 100% of plants are either polyploid or have an evolutionary history of paleopolyploidy. However, the timing and phylogenetic placement of these events (Figure 1) will continue to be a focus of genetic research. The knowledge gained will have important practical applications. In the cereals, for example, there is substantial interest in isolating genes for agronomic traits from ‘small genome’ crops such as rice and extending knowledge to ‘large genome’
Acknowledgements
We are grateful for the comments of two anonymous reviewers. This work is supported by NSF grants DEB-0316157 and DBI-0321467 to B.S.G.
References (50)
- et al. Polyploidy: recurrent formation and genome evolution. Trends Ecol. Evol. (1999)
- Adaptive evolution and functional divergence of pepsin gene family. Gene (2004)
- Evolution of cis-regulatory elements in duplicated genes of yeast. Trends Genet. (2003)
- How much expression divergence after yeast gene duplication could be explained by regulatory motif evolution? Trends Genet. (2004)
- et al. Molecular evidence for an ancient duplication of the entire yeast genome. Nature (1997)
- Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature (2004)
- Extensive genomic duplication during early chordate evolution. Nat. Genet. (2002)
- The origins of genomic duplications in Arabidopsis. Science (2000)
- The hidden duplication past of Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. (2002)
- Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature (2003)
- Evidence that rice and other cereals are ancient aneuploids. Plant Cell
- A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science
- Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl. Acad. Sci. U. S. A.
- LineUp: statistical detection of chromosomal homology with application to plant comparative genomics. Genome Res.
- Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell
- Vertebrate genome evolution and the zebrafish gene map. Nat. Genet.
- Discovery of tetraploidy in a mammal. Nature
- Preservation of duplicate genes by complementary degenerative mutations. Genetics
- Conserved noncoding sequences in the grasses. Genome Res.
- Utility and distribution of conserved noncoding sequences in the grasses. Proc. Natl. Acad. Sci. U. S. A.
- Noncoding sequences conserved in a limited number of mammals in the SIM2 interval are frequently functional. Genome Res.
- rVista for comparative sequence-based discovery of functional transcription factor binding sites. Genome Res.
- Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res.
- Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res.