Deep phylogenetic incongruence in the angiosperm clade Rosidae

https://doi.org/10.1016/j.ympev.2014.11.003

Highlights

  • We report conflict in the phylogenetic placement of COM among chloroplast, mitochondrial, and nuclear data.

  • Results of multi-gene and genomic data show strong evidence for deep incongruence.

  • We provide an example for examination of other deep nodes of the tree of life.

  • Genomic datasets highlight patterns of deep incongruence in angiosperm phylogeny.

  • We stress the complexity of angiosperm evolution, which may be masked when only a few genes are analyzed.

Abstract

Analysis of large data sets can help resolve difficult nodes in the tree of life and also reveal complex evolutionary histories. The placement of the Celastrales–Oxalidales–Malpighiales (COM) clade within Rosidae remains one of the most confounding phylogenetic questions in angiosperms, with previous analyses placing it with either Fabidae or Malvidae. To elucidate the position of COM, we assembled multi-gene matrices of chloroplast, mitochondrial, and nuclear sequences, as well as large single- and multi-copy nuclear gene data sets. Analyses of multi-gene data sets demonstrate conflict between the chloroplast and both nuclear and mitochondrial data sets, and the results are robust to various character-coding and data-exclusion treatments. Analyses of single- and multi-copy nuclear loci indicate that most loci support the placement of COM with Malvidae, fewer loci support COM with Fabidae, and almost no loci support COM outside a clade of Fabidae and Malvidae. Although incomplete lineage sorting and ancient introgressive hybridization remain as plausible explanations for the conflict among loci, more complete sampling is necessary to evaluate these hypotheses fully. Our results emphasize the importance of genomic data sets for revealing deep incongruence and complex patterns of evolution.

Introduction

Genome-scale data can provide the power to resolve some of the most perplexing parts of the tree of life (e.g., Dunn et al., 2008, Lee et al., 2011, Simon et al., 2012, Smith et al., 2011, Yoder et al., 2013). Furthermore, estimates from numerous independent loci also can reveal phylogenetic incongruence caused by different evolutionary processes, such as gene duplication and loss, recombination, hybridization, lateral gene transfer, or incomplete lineage sorting (e.g., Cui et al., 2013, Degnan and Rosenberg, 2009, Doyle, 1992, Goodman et al., 1979, Hudson, 1983, Maddison, 1997, Oliver, 2013). Molecular phylogenetic analyses have resolved much of the backbone angiosperm phylogeny (e.g., Ruhfel et al., 2014, Soltis et al., 2009, Soltis et al., 2011) and clarified long-standing questions regarding relationships within major clades such as monocots (Monocotyledoneae; Chase et al., 2000, Givnish et al., 2006, Givnish et al., 2010, Graham et al., 2006, Jerrold et al., 2004; Saarela et al., 2008; Saarela and Graham, 2010), asterids (Asteridae; Albach et al., 2001, Bremer et al., 2001, Bremer et al., 2004, Hilu et al., 2003, Moore et al., 2011, Olmstead et al., 2000), and rosids (Rosidae; Hilu et al., 2003, Jansen et al., 2007, Moore et al., 2010, Qiu et al., 2010, Soltis et al., 2007, Soltis et al., 2011, Wang et al., 2009). Yet much of this work is based either largely or exclusively on chloroplast sequence data, which represent a single, linked, and usually maternally inherited locus. New sequencing technologies make it feasible to obtain data sets of numerous independent nuclear loci, which can be used to evaluate results from analyses of chloroplast gene sequence data and reveal phylogenetic conflict among loci (e.g., Burleigh et al., 2011, Duarte et al., 2010, Lee et al., 2011, Xi et al., 2014, Zeng et al., 2014).

Introgressive hybridization has played an important role in plant evolution, and incomplete lineage sorting also likely occurred during some rapid radiations. Consequently, there are numerous examples of discordance between chloroplast and nuclear gene trees in plants (e.g., Acosta and Premoli, 2010, Okuyama et al., 2005, Rieseberg and Soltis, 1991; Rieseberg and Wendel, 1993; Rieseberg et al., 1995, Rieseberg et al., 1996, Soltis and Kuzoff, 1995, Soltis and Soltis, 2009, Tsitrone et al., 2003, Wendel et al., 1995, Xi et al., 2014). Although phylogenetic analyses of angiosperm backbone relationships based on nuclear, mitochondrial, and chloroplast loci have largely agreed, one major point of conflict is the placement of COM (Celastrales–Oxalidales–Malpighiales; Endress and Matthews, 2006, Zhu et al., 2007) within the large Rosidae clade.

Rosidae comprise approximately one quarter of all angiosperm species, which are morphologically diverse, exhibit extraordinary heterogeneity in habit, habitat, and life form, and include most temperate and tropical forest trees (Wang et al., 2009). Some members possess novel biochemical pathways (e.g., production of glucosinolates and cyanogenic glycosides for defense), and many are important crops (e.g., members of Fabaceae and Rosaceae). Symbioses with nitrogen-fixing bacteria are largely confined to this clade as well. Resolving relationships within Rosidae has been difficult (e.g., Hilu et al., 2003, Jansen et al., 2007, Lee et al., 2011, Moore et al., 2010, Moore et al., 2011, Qiu et al., 2010, Ruhfel et al., 2014, Soltis et al., 2005, Soltis et al., 2007, Soltis et al., 2011, Wang et al., 2009, Zhu et al., 2007) due to a series of rapid radiations (Wang et al., 2009). However, multi-gene studies have recovered two major, well-supported clades: the Fabidae (i.e., eurosids I, fabids) and Malvidae (i.e., eurosids II, malvids) (Hilu et al., 2003, Judd and Olmstead, 2004, Moore et al., 2010, Moore et al., 2011, Soltis et al., 1999, Soltis et al., 2000, Soltis et al., 2005, Soltis et al., 2007, Soltis et al., 2011, Wang et al., 2009, Xi et al., 2014).

COM contains approximately one third of all Rosidae, 870 genera and ∼19,000 species (APG, 2009). Molecular analyses, largely dominated by chloroplast genes, have usually placed COM with Fabidae (Table 1; e.g., Burleigh et al., 2009, Hilu et al., 2003, Jansen et al., 2007, Moore et al., 2010, Moore et al., 2011, Soltis et al., 2005, Soltis et al., 2007, Soltis et al., 2011, Wang et al., 2009). Analyses of the mitochondrial gene matR first suggested the placement of COM with Malvidae (Zhu et al., 2007), and subsequent studies based on nuclear or mitochondrial genes supported this placement, although typically with limited taxon sampling (Table 1; Burleigh et al., 2011, Duarte et al., 2010, Finet et al., 2010, Lee et al., 2011, Morton, 2011, Qiu et al., 2010, Shulaev et al., 2010, Xi et al., 2014, Zhang et al., 2012). Several floral characters also appear to link COM with Malvidae. For example, in COM and Malvidae species, the inner integument of the ovule is thicker than the outer integument at the time of fertilization, a feature that is extremely rare in Fabidae and other eudicots. Additionally, contorted petals and a tendency towards polystemony and polycarpy also suggest a placement of COM members with Malvidae rather than with Fabidae (Endress and Matthews, 2006, Endress et al., 2013).

Although analyses of chloroplast gene sequence data generally appear to conflict with analyses of mitochondrial and nuclear gene sequence data, these studies often differ greatly in taxon sampling and analytical methods (Table 1; but see Xi et al., 2014). Thus, it is unclear whether the different placements of COM are due to errors in the analyses or biological incongruence among loci. The level of incongruence within the nuclear genome also is unknown. We use COM as an exemplar to investigate phylogenetic incongruence at deep levels in angiosperm phylogeny. Specifically, we first compare phylogenetic results from chloroplast, mitochondrial, and nuclear data sets having similar taxon sampling and examine whether the results are robust to various character-coding and data-exclusion protocols. We also survey large-scale nuclear data sets of both single-copy and multi-copy genes to investigate the patterns of phylogenetic discordance within the nuclear genome and then discuss whether these patterns are consistent with incomplete lineage sorting (i.e., deep coalescence) (Maddison, 1997, Maddison and Knowles, 2006, Page and Charleston, 1998) or ancient hybridization and introgression (Chang et al., 2011, Cui et al., 2013, Linder and Rieseberg, 2004, Tsitrone et al., 2003, Zhang et al., 2014).
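To make the distinction between incomplete lineage sorting and introgression more concrete, the following is a minimal sketch of the standard multispecies-coalescent expectation for three lineages. It is an illustration only, not the analysis performed in this study. For a rooted species tree ((A,B),C) with an internal branch of length t in coalescent units, the concordant gene-tree topology is expected with probability 1 - (2/3)exp(-t), and each of the two discordant topologies with probability (1/3)exp(-t). The labels A, B, and C, the function names, and the asymmetry summary are assumptions introduced here; A, B, and C could stand for any arrangement of COM, Fabidae, and Malvidae.

```python
import math


def triplet_topology_probs(t):
    """Expected gene-tree topology probabilities for the rooted species
    tree ((A,B),C) under the multispecies coalescent.

    t is the internal branch length in coalescent units (hypothetical
    input; this sketch does not estimate it from data).
    """
    if t < 0:
        raise ValueError("branch length t must be non-negative")
    # Each discordant topology arises only if A and B fail to coalesce
    # on the internal branch (probability exp(-t)); the three lineages
    # then coalesce in random order, so each pairing has probability 1/3.
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # concordant topology
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


def minor_topology_asymmetry(counts):
    """Relative difference between the two least-supported topologies.

    counts maps three topology labels to numbers of supporting loci.
    Values near 0 are what lineage sorting alone predicts; larger values
    are one commonly used hint of introgression, although gene-tree
    estimation error can also produce them.
    """
    ordered = sorted(counts.values(), reverse=True)
    if len(ordered) != 3:
        raise ValueError("expected counts for exactly three topologies")
    _, second, third = ordered
    total = second + third
    return 0.0 if total == 0 else (second - third) / total
```

Under this model the two minor topologies are expected at equal frequency regardless of t, so an asymmetry well above zero in observed per-locus counts is the kind of pattern that motivates considering hybridization alongside deep coalescence. A formal test would also need to account for sampling variance and gene-tree error, which this sketch does not attempt.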

Materials and methods

Throughout this paper, to facilitate discussion, we treat COM, Fabidae, and Malvidae as three separate groups, despite current classifications that consider COM to be part of Fabidae (APG, 2009, Cantino et al., 2007).

Chloroplast, mitochondrial, and nuclear data sets

ML analyses of the chloroplast, mitochondrial, and nuclear multi-gene alignments with similar taxon sampling recover different placements of COM (Fig. 1, Fig. 2, Fig. 3). We focus on the relationships among members of Rosidae, but all the trees generated in our analyses in the present study are available as supplemental data and on Dryad (http://dx.doi.org/10.5061/dryad.7sg58).

The phylogeny based on the 82-taxon, 78-gene chloroplast data set largely agrees with conclusions from previous

Conflict among multi-locus phylogenetic analyses

In spite of much recent progress resolving the angiosperm tree of life, the phylogenetic placement of COM remains uncertain. Most previous efforts to place COM have used a variety of data sources, taxon sampling strategies, and phylogenetic methods (but see Xi et al., 2014). Therefore, it is difficult to determine if the conflicting placements of COM are due to errors or actual biological conflict among loci (Table 1). Our ML analyses of multi-gene chloroplast, mitochondrial, and nuclear data

Concluding remarks

Numerous plant systematics studies have demonstrated the promise of genomic data to resolve angiosperm relationships that were not evident in analyses with a few genes (Burleigh et al., 2011, Finet et al., 2010, Lee et al., 2011, Moore et al., 2010, Moore et al., 2011, Zeng et al., 2014). We demonstrate here that analyses of data sets with many unlinked loci can highlight the ambiguity and discordance in phylogenetic relationships and potentially reveal the complexity of angiosperm evolution.

Conflict of interest

The authors declare no conflict of interest.

Acknowledgments

We thank Yin-Long Qiu, who contributed to the early design of this project, and Ning Zhang, who graciously provided us with the 92-taxon, 5-gene nuDNA alignment used in this study. This work was supported by the National Natural Science Foundation of China (NNSF 31270268), National Basic Research Program of China (No. 2014CB954101), Chinese Academy of Sciences Visiting Professorship for Senior International Scientists (grant number 2011T1S24), State Key Laboratory of Systematic and Evolutionary

References (111)

  • C.W. Birky

    Uniparental inheritance of mitochondrial and chloroplast genes: mechanisms and evolution

    Proc. Natl. Acad. Sci. USA

    (1995)
  • C.W. Birky

    The inheritance of genes in mitochondria and chloroplasts: laws, mechanisms, models

    Annu. Rev. Genet.

    (2001)
  • K. Bremer et al.

    A phylogenetic analysis of 100+ genera and 50+ families of euasterids based on morphological and molecular data with notes on possible higher level morphological synapomorphies

    Plant Syst. Evol.

    (2001)
  • K. Bremer et al.

    Molecular phylogenetic dating of asterid flowering plants shows early Cretaceous diversification

    Syst. Biol.

    (2004)
  • T.R. Buckley et al.

    Differentiating between hypotheses of lineage sorting and introgression in New Zealand alpine cicadas (Maoricicada dugdale)

    Syst. Biol.

    (2006)
  • J.G. Burleigh et al.

    Inferring phylogenies with incomplete data sets: a 5-gene, 567-taxon analysis of angiosperms

    BMC Evol. Biol.

    (2009)
  • J.G. Burleigh et al.

    Genome-scale phylogenetics: inferring the plant tree of life from 18,896 gene trees

    Syst. Biol.

    (2011)
  • P.D. Cantino et al.

    Towards a phylogenetic nomenclature of Tracheophyta

    Taxon

    (2007)
  • S.W. Chang et al.

    Ancient hybridization and underestimated species diversity in Asian striped squirrels (genus Tamiops): inference from paternal, maternal and biparental markers

    J. Zool.

    (2011)
  • M.W. Chase et al.

    Phylogenetics of seed plants: an analysis of nucleotide-sequences from the plastid gene rbcL

    Ann. Mo. Bot. Gard.

    (1993)
  • M.W. Chase et al.

    Higher-level systematics of the monocotyledons: an assessment of current knowledge and a new classification

  • F. Chen et al.

    OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups

    Nucleic Acids Res.

    (2006)
  • J.L. Corriveau et al.

    Rapid screening method to detect potential biparental inheritance of plastid DNA and results for over 200 angiosperm species

    Am. J. Bot.

    (1988)
  • R. Cui et al.

    Phylogenomics reveals extensive reticulate evolution in Xiphophorus fishes

    Evolution

    (2013)
  • J.H. Degnan et al.

    Discordance of species trees with their most likely gene trees

    PLoS Genet.

    (2006)
  • F. Delsuc et al.

    Phylogenomics and the reconstruction of the tree of life

    Nat. Rev. Genet.

    (2005)
  • J.J. Doyle

    Gene trees and species trees – molecular systematics as one-character taxonomy

    Syst. Bot.

    (1992)
  • J.M. Duarte et al.

    Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels

    BMC Evol. Biol.

    (2010)
  • C.W. Dunn et al.

    Broad phylogenomic sampling improves resolution of the animal tree of life

    Nature

    (2008)
  • P.K. Endress et al.

    Floral structure and systematics in four orders of rosids, including a broad survey of floral mucilage cells

    Plant Syst. Evol.

    (2006)
  • P.K. Endress et al.

    Advances in the floral structural characterization of the major subclades of Malpighiales, one of the largest orders of flowering plants

    Ann. Bot.

    (2013)
  • S. Fauré et al.

    Maternal inheritance of chloroplast genome and paternal inheritance of mitochondrial genome in bananas (Musa acuminata)

    Curr. Genet.

    (1994)
  • A. Gibson et al.

    A comprehensive analysis of mammalian mitochondrial genome base composition and improved phylogenetic methods

    Mol. Biol. Evol.

    (2005)
  • T.J. Givnish et al.

    Phylogeny of the monocotyledons based on the highly informative plastid gene ndhF: evidence for widespread concerted convergence

  • T.J. Givnish et al.

    Assembling the tree of the monocotyledons: plastome sequence phylogeny and evolution of Poales

    Ann. Mo. Bot. Gard.

    (2010)
  • M. Goodman et al.

    Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed by globin sequences

    Syst. Zool.

    (1979)
  • P. Górecki et al.

    GTP supertrees from unrooted gene trees: linear time algorithms for NNI based local searches

  • V.V. Goremykin et al.

    Automated removal of noisy data in phylogenomic analyses

    J. Mol. Evol.

    (2010)
  • S.W. Graham et al.

    Robust inference of monocot deep phylogeny using an expanded multigene plastid data set

    Aliso

    (2006)
  • G.A. Harrison et al.

    Four new avian mitochondrial genomes help get to basic evolutionary questions in the late Cretaceous

    Mol. Biol. Evol.

    (2004)
  • T. Hashimoto et al.

    Phylogenetic place of mitochondrion-lacking protozoan, Giardia lamblia, inferred from amino acid sequences of elongation factor 2

    Mol. Biol. Evol.

    (1995)
  • M.J. Havey et al.

    Differential transmission of the Cucumis organellar genomes

    Theor. Appl. Genet.

    (1998)
  • K.W. Hilu et al.

    Angiosperm phylogeny based on matK sequence information

    Am. J. Bot.

    (2003)
  • M.T. Holder et al.

    Difficulties in detecting hybridization

    Syst. Biol.

    (2001)
  • B.R. Holland et al.

    Using supernetworks to distinguish hybridization from incomplete lineage sorting

    BMC Evol. Biol.

    (2008)
  • R.R. Hudson

    Testing the constant-rate neutral allele model with protein sequence data

    Evolution

    (1983)
  • D.H. Huson et al.

    Reconstruction of reticulate networks from gene trees

  • R.K. Jansen et al.

    Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns

    Proc. Natl. Acad. Sci. USA

    (2007)
  • I. Jerrold et al.

    A phylogeny of the monocots, as inferred from rbcL and atpA sequence variation, and a comparison of methods for calculating jackknife and bootstrap values

    Syst. Bot.

    (2004)
  • S. Joly et al.

    A statistical approach for distinguishing hybridization and incomplete lineage sorting

    Am. Nat.

    (2009)