Trends in Genetics
Volume 21, Issue 1, January 2005, Pages 60-65

Plant conserved non-coding sequences and paralogue evolution

https://doi.org/10.1016/j.tig.2004.11.013

Genome duplication is a powerful evolutionary force and is arguably most prominent in plants, where several ancient whole-genome duplication events have been documented. Models of gene evolution predict that functional divergence between duplicates (subfunctionalization) is caused by the loss of regulatory elements. Studies of conserved non-coding sequences (CNSs), which are putative regulatory elements, indicate that plants have far fewer CNSs per gene than mammals, suggesting that plants have less complex regulatory mechanisms. Furthermore, a recent study of a duplicated gene pair in maize suggests that CNSs are lost in a complementary fashion, perhaps driving subfunctionalization. If subfunctionalization is common, one expects duplicate genes to diverge in expression; recent microarray analyses in Arabidopsis thaliana suggest that this is the case. Plant genomes are relatively complex on a genomic level because of the prevalence of whole-genome duplication and, paradoxically, subfunctionalization after duplication can lead to relatively simple regulatory regions on a per gene basis.

Introduction

Gene duplication is a major determinant of the size and gene complement of eukaryotic genomes. Perhaps the most spectacular method of gene duplication is whole-genome duplication via polyploidization, which has a major role in the evolutionary history of eukaryotes. For example, the yeast Saccharomyces cerevisiae contains numerous duplicated genes and chromosomal regions that are attributed to a polyploidy event ∼100 million years ago (mya) [1,2]. Similarly, the human genome retains vestiges of duplication events that occurred between 350 and 650 mya and that might be attributable to at least one polyploid event [3].

Genome duplication is particularly prominent in plants. Arabidopsis thaliana has experienced at least three ancient polyploid events [4,5,6]; rice (Oryza sativa) contains duplicated chromosomal regions that are attributable either to ancient segmental duplications [7] or to a paleopolyploid event [8,9]; genetic maps of maize (Zea mays ssp. mays) contain evidence of several large-scale duplication events [10]. More recently, Blanc and Wolfe [11] used EST data to investigate the number and relative age of duplicated genes in 14 plant species. Nine of these species contained evidence of ancient large-scale duplication events, reflecting at least seven paleopolyploid events in the phylogenetic history of the sample. At least 16 polyploid events have been documented during the evolutionary history of a relatively small sample of angiosperm taxa (Figure 1).

The inescapable conclusion is that the organization and evolution of plant genomes have been shaped by many recent and ancient polyploid events. By contrast, vertebrates have probably undergone only one or two large-scale genome duplication events throughout their ∼500 million year history [3,12,13]. With a few exceptions (such as amphibians [14]), extant polyploids are also rare. In mammals, for example, the only known polyploid is the tetraploid red viscacha rat of Argentina [15].

The relative frequency of genome duplication in plant and animal lineages affects their genome organization but might also have profound effects on gene function and regulation. Gene duplication has long been considered a crucial step in ‘freeing’ single-copy genes from selective constraint, enabling them to evolve new functions [12], but a pair of duplicated genes can also diverge in function as a result of changes in regulatory elements [16]. These observations raise a number of questions: What are the consequences of genome duplication for the regulatory complexity of plant genomes? Are differences in regulatory motifs – more specifically conserved non-coding sequences (CNSs) – evident between plant and animal genomes? How does genetic duplication affect regulatory elements, and what are the consequences for the evolution of promoter regions between paralogous genes? We address these questions in this article.
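The idea that duplicates can diverge through changes in regulatory elements [16] can be illustrated with a toy simulation. The sketch below is not the method of any study cited here; it assumes a deliberately simplified model in which each regulatory element is essential, a degenerative mutation can remove an element only while the other copy still carries it, and coding-region null mutations are ignored. All function and parameter names are hypothetical.

```python
import random


def resolve_duplicate_pair(n_elements, rng):
    """Return the regulatory elements retained by each copy of a duplicated gene.

    Assumption: every element is essential, so an element may be lost from
    one copy only while the other copy still carries it.
    """
    copies = [set(range(n_elements)), set(range(n_elements))]
    # Elements present in both copies are redundant and therefore free to be lost.
    redundant = set(range(n_elements))
    while redundant:
        element = rng.choice(sorted(redundant))
        loser = rng.randrange(2)          # copy hit by the degenerative mutation
        copies[loser].discard(element)
        redundant.discard(element)        # now unique to one copy, so preserved
    return copies


def classify(copies):
    """Label the outcome by whether both copies keep at least one element."""
    first, second = copies
    if not first or not second:
        return "nonfunctionalization"     # one copy has lost every element
    return "subfunctionalization"         # complementary, partial element sets


def simulate(n_elements=4, trials=10000, seed=0):
    """Count outcomes over many independent duplicated pairs."""
    rng = random.Random(seed)
    counts = {"subfunctionalization": 0, "nonfunctionalization": 0}
    for _ in range(trials):
        counts[classify(resolve_duplicate_pair(n_elements, rng))] += 1
    return counts


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions each element ends up in exactly one copy, chosen with equal probability, so a pair with n elements is expected to end with one empty copy with probability 2^(1-n). In this toy model both copies therefore tend to be preserved more often as the ancestral gene carries more independent elements, and each preserved copy carries fewer elements than its ancestor.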

The complexity of plant and animal CNSs

CNSs are short stretches of non-coding DNA that have been preserved between species. Such conservation might be indicative of selective constraint and hence function. CNSs are found predominantly in upstream regulatory regions and are enriched for sequences that perform regulatory functions [17,18]. Indeed, CNSs have a functional role in gene expression [19], and it is thought that they are assembly points for large, multi-protein complexes that perform gene-regulatory functions [18]. In the

CNSs in plant paralogues

In addition to organismal complexity, gene duplication can also contribute to the substantial differences between CNSs in cereal and mammalian genomes. Gene duplication is an important driving force for generating evolutionary novelty. Ohno's classical model of gene duplication [12] states that a gene under tight functional constraint is ‘freed’ from selection pressures once a duplication event creates a redundant copy. This liberated gene copy has two potential fates: it can acquire a new

Fractionation: functional biases and the evolution of gene expression

Given the prevalence of genome duplication in plants, fractionation has a crucial role in shaping the functional complement of plant genomes. However, surprisingly little is known about the patterns and processes of fractionation. For example, some studies indicate that genome rearrangement occurs rapidly after polyploidization, suggesting that gene loss is initially rapid but eventually slows [35,36]. Nonetheless, we do not know how widely this applies or the long-term rates of gene loss in

Future directions

It is not an exaggeration to say that 100% of plants are either polyploid or have an evolutionary history of paleopolyploidy. However, the timing and phylogenetic placement of these events (Figure 1) will continue to be a focus of genetic research. The knowledge gained will have important practical applications. In the cereals, for example, there is substantial interest in isolating genes for agronomic traits from ‘small genome’ crops such as rice and extending knowledge to ‘large genome’

Acknowledgements

We are grateful for the comments of two anonymous reviewers. This work is supported by NSF grants DEB-0316157 and DBI-0321467 to B.S.G.

References (50)

  • K. Vandepoele. Evidence that rice and other cereals are ancient aneuploids. Plant Cell (2003)
  • S.A. Goff. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science (2002)
  • A.H. Paterson. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl. Acad. Sci. U. S. A. (2004)
  • S. Hampson. LineUp: statistical detection of chromosomal homology with application to plant comparative genomics. Genome Res. (2003)
  • G. Blanc et al. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell (2004)
  • S. Ohno (1970)
  • J.H. Postlethwait. Vertebrate genome evolution and the zebrafish gene map. Nat. Genet. (1998)
  • M.H. Gallardo. Discovery of tetraploidy in a mammal. Nature (1999)
  • A. Force. Preservation of duplicate genes by complementary degenerative mutations. Genetics (1999)
  • D.C. Inada. Conserved noncoding sequences in the grasses. Genome Res. (2003)
  • N.J. Kaplinsky. Utility and distribution of conserved noncoding sequences in the grasses. Proc. Natl. Acad. Sci. U. S. A. (2002)
  • K.A. Frazer. Noncoding sequences conserved in a limited number of mammals in the SIM2 interval are frequently functional. Genome Res. (2004)
  • G.G. Loots. rVista for comparative sequence-based discovery of functional transcription factor binding sites. Genome Res. (2002)
  • I. Dubchak. Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. (2000)
  • E.T. Dermitzakis. Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res. (2004)