Elsevier

Mitochondrion

Volume 17, July 2014, Pages 56-66

GC skew and mitochondrial origins of replication

https://doi.org/10.1016/j.mito.2014.05.009

Highlights

  • A method for detecting replication origins in mitochondrial genomes

  • The method has been applied to available chordate mitogenomes.

  • Predicted positions are in agreement with experimentally verified replication origins.

  • Results indicate largely preserved positions of the replication origins.

Abstract

A comprehensive understanding of mitochondrial genome evolution requires a detailed mechanistic picture of mitogenomic replication. Despite many previous efforts, it has remained a non-trivial problem to determine the origins of replication and to trace their fate across rearrangements of the gene order, even in the small genomes of animal mitochondria. We elaborate here on the observation that the GC skew is correlated with the distance from the replication origins. This effect has been explained as a consequence of the standard model of mitochondrial DNA replication, i.e. the strand displacement model. According to this model, chemical damage accumulates in proportion to the time that DNA is exposed in single-stranded form during replication (DssH), which depends on the relative position with respect to the replication origins. Based on this model we developed a computational method to infer the positions of both the heavy strand and the light strand origin from nucleotide skew data. In a comprehensive survey of deuterostome mitochondria we infer conserved replication origins for the vast majority of vertebrates and cephalochordates. Deviations from the consensus picture are presumably associated with genome rearrangements.

Introduction

Mitochondria are organelles that are virtually ubiquitous in eukaryotic cells. It is assumed that they originated from a proteobacterial ancestor through an endosymbiotic event (Gray and Archibald, 2012). In most eukaryotic lineages they have retained their own small genome; see Bernt et al. (2013b) for a recent review. Metazoan mitochondrial DNA (mtDNA) is (with few exceptions) a circular molecule about 16.5 kb in length. It encodes 13 proteins of the mitochondrial respiratory chain, 2 ribosomal RNAs (rRNAs), and a set of 22 tRNAs that is sufficient for the translation of the mitogenomically encoded proteins (Boore, 1999). A remarkable feature of chordate mitochondrial genomes is the distinction between the two strands: the “heavy” H-strand is G-rich and the “light” L-strand is G-poor. The L-strand harbors most of the genes and hence is sometimes referred to as the major coding strand.

Due to their small size, metazoan mitogenomes have been sequenced in large numbers and thus provide an important source of phylogenetic information with a still unmatched taxon coverage (Bernt et al., 2012a, Boore, 1999). Mitochondrial gene orders are subject to several types of genome rearrangement: inversion and transposition (Boore and Brown, 1998), inverse transposition which includes an inversion of the transposed part (Boore, 1999), and tandem duplication random loss which consists of a duplication of a continuous segment of genes such that the original segment and its copy are consecutive, followed by random loss of one copy of each of the redundant genes (Bernt and Middendorf, 2011, Boore, 2000, Podsiadlowski et al., 2007).

In chordate mitogenomes, the large genes (protein coding and rRNA) are distributed unevenly between the two strands with 12 of 13 proteins and both rRNAs transcribed from the H-strand (i.e. annotated on the L-strand). Only the tRNAs are more evenly distributed, with 14 of 22 produced from H-strand transcripts (Clayton, 2000), see Fig. 1.

Chordate mitogenomes, with the exception of the highly rearranged mitogenomes of tunicates (Gissi et al., 2010), exhibit a generally constant order of the large genes that has been disturbed only by a few phylogenetic events of which most are transpositions. The typical chordate mitochondrial gene order is assumed to be the ancestral state of deuterostomes (Bernt et al., 2013a, Lavrov and Lang, 2005). Within birds and reptiles mitogenomes have been subject to rearrangements more frequently (Boore, 1999). An example is the transposition of nad6 and cob in Rhineura floridana (Lepidosauria). Rearrangements of tRNAs are observed more frequently (Kumazawa and Nishida, 1995, Pääbo et al., 1991). In contrast to chordates, in the mitogenomes of hemichordates and echinoderms several of the large genes have been subject to rearrangements (Perseke et al., 2008, Perseke et al., 2011).

The often large excess in the number of guanine residues in the H-strand compared to the L-strand violates the 2nd parity rule (Sueoka, 1995), which stipulates an expected intra-strand equilibrium that satisfies A = T and G = C (Reyes et al., 1998). This asymmetry in the nucleotide distribution is traditionally quantified in terms of the AT skew (A − T)/(A + T) and the GC skew (G − C)/(G + C) (Perna and Kocher, 1995). Many studies suggested that the differences in the nucleotide frequencies are associated with replication (Clayton, 1982, Reyes et al., 1998, Tanaka and Ozawa, 1994), which places the H-strand in a single-stranded state and exposes it to an asymmetric mutation process: (i) hydrolytic deamination of cytosine, (ii) hydrolytic deamination of adenine, and (iii) oxidation of guanine (Reyes et al., 1998, Tanaka and Ozawa, 1994). While the H-strand is in the single-stranded state it is only incompletely protected, which exposes it to hydrolytic DNA damage until it is paired again by the light strand replication (Tanaka and Ozawa, 1994). The extent of mutation therefore depends on the time that the H-strand spends in the single-stranded state.
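As a minimal sketch of the two skew measures defined above, the following Python functions compute the AT and GC skew of a sequence and of sliding windows along a circular genome. The function names, the window and step parameters, and the convention of returning 0 for a window that contains neither of the two bases are assumptions made for illustration and are not taken from the original work.

```python
from collections import Counter


def skew(seq, x, y):
    """Return (x - y) / (x + y) for the counts of nucleotides x and y in seq."""
    counts = Counter(seq.upper())
    total = counts[x] + counts[y]
    if total == 0:
        # Convention chosen for this sketch: a window lacking both bases has skew 0.
        return 0.0
    return (counts[x] - counts[y]) / total


def gc_skew(seq):
    """GC skew (G - C) / (G + C)."""
    return skew(seq, "G", "C")


def at_skew(seq):
    """AT skew (A - T) / (A + T)."""
    return skew(seq, "A", "T")


def windowed_skew(seq, window, step, skew_fn=gc_skew):
    """Skew in sliding windows along a circular sequence.

    Returns a list of (window_start, skew) pairs. The window must not be
    longer than the sequence.
    """
    n = len(seq)
    doubled = seq + seq[:window]  # lets windows wrap past the end of the circle
    return [(start, skew_fn(doubled[start:start + window]))
            for start in range(0, n, step)]
```

Which strand the skew refers to depends on which strand the input sequence represents; the sketch simply counts the bases of the sequence it is given.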

Transcription also creates a transient single-stranded state of the non-transcribed DNA strand, exposing it to DNA damage while directing repair enzymes to the transcribed strand (Francino and Ochman, 1997). Since mtDNA is transcribed in the form of large polycistronic transcripts, one would expect an approximately constant mutation rate along the genome. This is inconsistent, however, with the observed variation of the GC skew along the mitogenome (Reyes et al., 1998). In addition, the sign of the observed skews is opposite to that of the skews expected to be created by transcription (Reyes et al., 1998). Altogether, replication might be considered the major source of GC skew variation in animal mitogenomes.

The standard model for the replication of animal mitogenomes is the so-called strand displacement model (SDM) (Clayton, 1991), see Fig. 2. According to this model, mitogenomes replicate asymmetrically, with each strand being synthesized starting from its own specific position in the genome. Replication is initiated in the control region. The replication fork starts at the origin of the H-strand replication (OH), located within a D-loop structure (produced by the nascent H-strand displacing the parental H-strand). The synthesis of the H-strand is started by the elongation of a suitable RNA primer by DNA polymerase γ. The H-strand replication continues unidirectionally towards OL, leaving the parental H-strand in a single-stranded state. In mammalian mitogenomes the origin of the light strand replication (OL) is assumed to be located about 11 kb downstream of OH and to be composed of a short (about 30 nt) non-coding region. This region is thought to fold into a stem-loop structure once the H-strand replication fork that started at OH arrives, which initiates the L-strand replication in the opposite direction (Clayton, 1982). During this process, which takes about 2 h (Clayton, 1982), positions on the H-strand remain single-stranded for a time span that depends on their location with respect to OH and OL. During this time they are only incompletely protected by single-stranded mtDNA binding proteins.
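The dependence of the single-stranded time on the position relative to OH and OL can be made concrete with a small sketch. It assumes a strongly simplified reading of the SDM: both forks move at the same constant rate, L-strand synthesis starts the moment the H-strand fork reaches OL, and it then runs around the circle in the opposite direction. Under these assumptions a site between OH and OL is exposed for twice its distance to OL, and a site beyond OL for the genome length minus twice its distance past OL. The function name and the `direction` parameter (the sense in which the H-strand fork moves along the coordinate system) are hypothetical and serve illustration only; this is not the measure implemented by the authors.

```python
def time_single_stranded(pos, o_h, o_l, length, direction=1):
    """Time a site on the parental H-strand stays single-stranded, in units of
    the time one fork needs to copy the whole circle.

    Simplifying assumptions of this sketch: constant and equal fork speeds,
    L-strand synthesis starts when the H-strand fork reaches o_l and proceeds
    in the opposite direction around the circular genome.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    # Distance the H-strand fork travels from OH to reach the site.
    x = (direction * (pos - o_h)) % length
    # Distance from OH to OL along the same direction.
    d = (direction * (o_l - o_h)) % length
    if x <= d:
        # Displaced at time x; L-strand synthesis starts at time d and has to
        # run back d - x to pair the site again.
        return 2 * (d - x) / length
    # Beyond OL: displaced at time x, but L-strand synthesis must travel
    # around the circle (d + length - x) before it reaches the site.
    # A negative result would mean the L-strand fork arrives first, a case
    # this sketch does not model further.
    return (length - 2 * (x - d)) / length
```

For example, with a genome of 16,500 nt, OH at coordinate 0 and OL 11,000 nt downstream, a site at OL itself yields 0, while a site immediately beyond OL yields a value close to 1.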

A refinement of the SDM assuming “RNA incorporation throughout the lagging strand” (RITOLS) was proposed by Yasukawa et al. (2006). The RITOLS model shares with the SDM the assumption of two distinct replication origins and the same asynchronous replication process, but in contrast to the SDM, RNA is incorporated on the single-stranded H-strand. This was recently supported by Reyes et al. (2013), who identified processed transcripts as the RNA that is incorporated onto the single-stranded H-strand via a “bootlace mechanism”, leaving only short uncovered gaps. The lagging strand RNA is either converted to (Reyes et al., 2013) or replaced by (Holt and Reyes, 2012) DNA during L-strand replication. The RITOLS model apparently rules out the possibility of a single-stranded H-strand. In particular, under this model the time that a position on the H-strand spends single-stranded should not depend on its position relative to the replication origins.

A strand-coupled bidirectional replication process was suggested by Holt et al. (2000), in which replication of both strands starts synchronously in a broad region downstream of OH. Bowmaker et al. (2003) refined this model by showing that replication starts bidirectionally but that one direction is stopped when the replication fork reaches OH. This replication model assumes neither the existence of OL nor longer periods in the single-stranded state. Strand-coupled replication occurs only when cells are recovering from ethidium bromide-induced mtDNA depletion (Holt and Reyes, 2012, McKinney and Oliveira, 2013).

Reyes et al. (1998) reported, for mammalian mitogenomes, a negative correlation between the GC skew (measured on the L-strand) and the time that the H-strand spends single-stranded during replication according to the SDM. This was explained as a consequence of the SDM in conjunction with the asymmetric mutation process. The reasoning is that, because of the asynchrony of the replication, a position on the H-strand (L-strand) that is reached first by the H-strand (L-strand) replication and then by the L-strand (H-strand) replication is single-stranded between these two events. Hence, which of the two strands is single-stranded, and for how long, depends on the position relative to the origins. The asymmetric mutation process leads to a decrease in C and an increase in G, i.e. an increased GC skew, on the strand that is single-stranded (Reyes et al., 1998).

The coupling of the nucleotide skews with replication suggests that it should be possible to estimate the position of the replication origins directly from position-wise nucleotide frequency data. The linear relation was used previously by Seligmann et al. (2006) to investigate the possible role of tRNAs as light-strand origins in primate mitogenomes. They computed regression of C and T abundances as a function of the time that the H-strand spends single stranded when computed for alternative OLs and found significant correlations of the C/T quotient with the time spent single stranded for many tRNA gene clusters. They concluded that tRNA genes have properties similar to those of OL that allow them to function as alternative OL.
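The idea of scoring alternative light-strand origins by regression can be illustrated with a short sketch. It fixes OH, computes for each candidate OL the single-stranded time of every site under a simplified constant-rate strand displacement model, and keeps the candidate whose times explain the observed skews best in terms of the coefficient of determination of a simple linear regression. All function names, the choice of R² as the score, and the simplified time formula are assumptions made for illustration; the sketch is neither the procedure of Seligmann et al. (2006) nor the method of this paper.

```python
def dss(pos, o_h, o_l, length):
    """Single-stranded time of a site under a simplified SDM (sketch):
    equal constant fork speeds, coordinates increasing in the direction of
    H-strand synthesis, result in units of one full round of the circle."""
    x = (pos - o_h) % length
    d = (o_l - o_h) % length
    if x <= d:
        return 2 * (d - x) / length
    return (length - 2 * (x - d)) / length


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear relation can be scored
    return sxy * sxy / (sxx * syy)


def scan_light_origin(positions, skews, o_h, length, candidates):
    """Return (best_r2, best_o_l) over the candidate OL coordinates.

    positions: coordinates at which the skew was measured
    skews:     skew values at those coordinates
    """
    best_r2, best_o_l = -1.0, None
    for o_l in candidates:
        if o_l % length == o_h % length:
            continue  # OL and OH are assumed to be distinct sites
        times = [dss(p, o_h, o_l, length) for p in positions]
        score = r_squared(times, skews)
        if score > best_r2:
            best_r2, best_o_l = score, o_l
    return best_r2, best_o_l
```

A practical use would pass skews measured in windows or at selected sites and would also inspect the sign of the fitted slope, since the relation reported by Reyes et al. (1998) is a negative correlation when the GC skew is measured on the L-strand.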

Previous studies tried to map the replication origins using nucleotide distributions primarily for genomes following a θ-replication scheme, e.g. prokaryotic genomes. Salzberg et al. (1998) used “skewed” octamers, which are over-represented on the leading strand, for finding replication origins. Picardeau et al. (2000) and Arakawa et al. (2007) used the switch in polarity that can be observed in cumulative GC or AT skew diagrams as an indicator for a replication origin. Arakawa et al. (2007) used a noise-reduction approach based on a fast Fourier transform of the GC skew signal to increase the prediction accuracy. This method is not applicable to mitogenomes because the cumulative skew diagrams do not show a V-shape (Grigoriev, 1998), which is presumably due to a non θ-type replication mechanism.
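For comparison, a cumulative GC skew diagram of the kind used for θ-replicating genomes can be sketched in a few lines. The version below accumulates +1 for every G and −1 for every C along the sequence, which is one common per-base variant; window-based variants sum the GC skew of consecutive windows instead. The function names are hypothetical. In genomes with a V-shaped diagram the extrema are the points where the skew changes polarity, which is the signal these approaches interpret as origin and terminus.

```python
from itertools import accumulate


def cumulative_gc_skew(seq):
    """Running sum of +1 for each G and -1 for each C along the sequence."""
    steps = {"G": 1, "C": -1}
    return list(accumulate(steps.get(base, 0) for base in seq.upper()))


def polarity_switches(cumulative):
    """Return the indices of the global minimum and maximum of the diagram.

    Ties are resolved in favor of the first occurrence.
    """
    indices = range(len(cumulative))
    lowest = min(indices, key=cumulative.__getitem__)
    highest = max(indices, key=cumulative.__getitem__)
    return lowest, highest
```

A sequence consisting of a C-rich half followed by a G-rich half produces the V-shape, with the minimum at the boundary between the two halves.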

The subsequent parts of this paper are organized as follows. The linear regression based approach to determine the replication origins that is proposed in this paper is described in Section 2. The findings obtained from the application of the approach to the chordate mitogenomes are described in Section 3 and critically discussed in Section 4.

Section snippets

Data set

We used the 2074 complete deuterostome mitochondrial genomes from RefSeq release 58 (Pruitt et al., 2004). Sequences of species that have the same gene order (for protein coding genes and rRNAs) have been grouped based on the NCBI taxonomy (Sayers et al., 2009). The reconstruction of the rearrangements on the phylogeny given by the NCBI taxonomy has been conducted using TreeREx (Bernt et al., 2008). Groups with insufficient alignment quality have been subdivided to the order level to create

Results and discussion

The linear regression based approach for the detection of the replication origins has been applied to all 58 data sets that include all available mitogenomes of Vertebrata and basal deuterostome/chordate clades with moderately rearranged mitogenomes. To the best of our knowledge, experimental data on the positions of the replication origins have been published for only five key species (Supplementary File 1). RefSeq annotations, furthermore, usually do not specify the position of OH, but

Concluding remarks

We have presented here a new method to locate the origins of replication in metazoan mitochondrial genomes. An implementation of the method is available. The method makes use of the linear relation between the time spent single stranded and the distribution of the nucleotide skews along the genome. Two alternatives for computing a measure for the time spent single stranded are discussed. A survey of the mitogenomes of chordates and some basal deuterostomes with only moderately rearranged gene

Acknowledgment

AHS was supported by a grant for the Lebanese University from the AZM and Saade Association. We thank the anonymous reviewers for their helpful comments.

References (67)

  • I.J. Holt et al. Coupled leading- and lagging-strand synthesis of mammalian mitochondrial DNA. J. Biol. Chem. (2000)

  • T.C. King et al. Mapping of control elements in the displacement loop region of bovine mitochondrial DNA. J. Biol. Chem. (1987)

  • P.A. Martens et al. Mechanism of mitochondrial DNA replication in mouse L-cells: localization and sequence of the light-strand origin of replication. J. Mol. Biol. (1979)

  • M. Miya et al. Major patterns of higher teleostean phylogenies: a new perspective based on 100 complete mitochondrial DNA sequences. Mol. Phylogenet. Evol. (2003)

  • D.J. Oh et al. Complete mitochondrial genome of the rock bream Oplegnathus fasciatus (Perciformes, Oplegnathidae) with phylogenetic considerations. Gene (2007)

  • M. Perseke et al. Evolution of mitochondrial gene orders in echinoderms. Mol. Phylogenet. Evol. (2008)

  • L. Podsiadlowski et al. The complete mitochondrial genome of Scutigerella causeyae (Myriapoda: Symphyla) and the phylogenetic position of symphyla. Mol. Phylogenet. Evol. (2007)

  • S.L. Salzberg et al. Skewed oligomers and origins of replication. Gene (1998)

  • E. Sbisà et al. Mammalian mitochondrial D-loop region structural analysis: identification of new conserved sequences and their functional and evolutionary implications. Gene (1997)

  • H. Seligmann et al. Possible multiple origins of replication in primate mitochondria: alternative role of tRNA sequences. J. Theor. Biol. (2006)

  • M. Tanaka et al. Strand asymmetry in human mitochondrial DNA mutations. Genomics (1994)

  • S. Anderson et al. Sequence and organization of the human mitochondrial genome. Nature (1981)

  • M. Bernt et al. A method for computing an inventory of metazoan mitochondrial gene order rearrangements. BMC Bioinforma. (2011)

  • M. Bernt et al. An algorithm for inferring mitogenome rearrangements in a phylogenetic tree. Lect. Notes Comput. Sci. (2008)

  • M. Bernt et al. Genetic aspects of mitochondrial genome evolution. Mol. Phylogenet. Evol. (2012)

  • M. Bernt et al. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol. Phylogenet. Evol. (2012)

  • M. Bernt et al. Mitochondrial Genome Evolution (2013)

  • J.L. Boore. Animal mitochondrial genomes. Nucleic Acids Res. (1999)

  • J.L. Boore. The duplication/random loss model for gene rearrangement exemplified by mitochondrial genomes of deuterostome animals

  • J.L. Boore. Requirements and standards for organelle genome databases. OMICS (2006)

  • J.L. Boore et al. Complete sequence, gene arrangement, and genetic code of mitochondrial DNA of the cephalochordate Branchiostoma floridae (Amphioxus). Mol. Biol. Evol. (1999)

  • S.J. Bourlat et al. The mitochondrial genome structure of Xenoturbella bocki (phylum Xenoturbellida) is ancestral within the deuterostomes. BMC Evol. Biol. (2009)

  • D.A. Clayton. Replication and transcription of vertebrate mitochondrial DNA. Annu. Rev. Cell Biol. (1991)
