Atomistic molecular simulations of protein folding

https://doi.org/10.1016/j.sbi.2011.12.001

Theory and experiment have provided answers to many of the fundamental questions of protein folding; a remaining challenge is an accurate, high-resolution picture of folding mechanism. Atomistic molecular simulations with explicit solvent are the most promising method for providing this information, by accounting more directly for the physical interactions that stabilize proteins. Although such simulations of folding are extremely challenging, they have become feasible as a result of recent advances in computational power, in the accuracy of the energy functions or ‘force fields’, and in methods for improving sampling of folding events. I review the recent progress in these areas, and highlight future challenges and questions that we may hope to address with these methods. I also attempt to place atomistic models into the context of the energy landscape view of protein folding and of coarse-grained simulations.

Highlights

► Advances in sampling now allow atomistic simulations of protein folding.
► Developments of energy functions have reduced secondary structure bias.
► Atomistic simulations give an estimate of the transition path time.
► Comparison to experimental results is critical – but very challenging.
► Studying the unfolded state is one of the frontiers for all-atom simulation.

Introduction

Theory and coarse-grained molecular simulations can give powerful insights into the nature of protein folding. Many properties of folding can be understood from the hypothesis that the energy landscape of proteins is ‘funneled’ [1, 2, 3], with both the energy and configurational entropy smoothly decreasing as a function of the nativeness of the structure, and only minimal ‘frustration’ due to non-native contacts [4]. Such a landscape can only arise through evolution or design, since the landscapes of random heteropolymers will not have these features [5]. Funnel-based approaches, including both theory and coarse-grained simulation models (Gō models [6]), have been very successful: they can explain the fact that proteins fold fast, the relative folding rates of different folds, and the folding mechanism at coarse resolution [7, 8, 9]. Such models can even be used to explain misfolding events, provided that these are driven by native-like interactions [10, 11].

An alternative to assuming a particular form for the energy landscape is to attempt to model constructively the specific physical interactions giving rise to the landscape. This type of model does not depend on knowing the folded structure and can therefore fully account for non-native contributions to the energy. The most feasible method of doing this is to use classical dynamics with an empirically parameterized potential energy surface, or ‘force field’ [12]. Both the protein and solvent are represented at atomic detail, with energy terms describing variations in energy due to bond and angle stretching, torsional rotations, dispersion, exchange and long-range electrostatics. In principle, if this is done accurately, all of the interactions present in the real system should be captured and the energy function will fold proteins. However, the disadvantage of this approach is that it comes at an enormous computational cost, relative to the more coarse-grained approaches described above. Furthermore, the actual energy functions used currently are certainly an approximation, neglecting effects such as electronic polarizability that are known to make a significant contribution to the total energy [13]. Therefore, one might ask whether undertaking such simulations is worth the effort, given that very useful results can be obtained easily by starting from funnel models derived from more high-level physical considerations. The devil's advocate might even ask: are atomistic simulations just very expensive and detailed, but possibly not very realistic, movies?
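As a rough illustration of the energy terms listed above, a fixed-charge additive force field is commonly written in a form like the one below. This is a generic sketch, not the exact function of any particular force field discussed in this article; the symbols (force constants $k_b$, $k_\theta$, $k_{\phi,n}$, reference values $b_0$, $\theta_0$, phases $\delta_n$, Lennard-Jones parameters $\varepsilon_{ij}$, $\sigma_{ij}$ and partial charges $q_i$) are introduced here for illustration only.

```latex
\[
U(\mathbf{r}) =
    \sum_{\text{bonds}} k_b\,(b - b_0)^2
  + \sum_{\text{angles}} k_\theta\,(\theta - \theta_0)^2
  + \sum_{\text{torsions}} \sum_n k_{\phi,n}\,\bigl[1 + \cos(n\phi - \delta_n)\bigr]
  + \sum_{i<j} \left\{
        4\varepsilon_{ij}\left[
            \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
          - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}
        \right]
      + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}}
    \right\}
\]
```

In this form the $r^{-12}$ term stands in for exchange repulsion, the $r^{-6}$ term for dispersion, and the Coulomb term for electrostatics between fixed partial charges. No term allows the charges to respond to their environment, which is the neglect of electronic polarizability mentioned above. Individual force fields differ in details such as prefactors, improper torsions and the scaling of interactions between atoms separated by three bonds.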

In fact, as a result of a general increase in computational power and the development of purpose-built computer hardware, the development of novel computational algorithms and improvements in energy functions, it has become possible in the last few years to fold a number of small proteins with all-atom simulations [14••, 15••, 16, 17, 18•]. In this article, I review the advances that have made this feat possible, with an emphasis on sampling algorithms and energy functions. I examine what additional insights we have gained into protein folding from running atomistic molecular dynamics simulations, and what we may hope to gain from them in future – in particular, the advantages that they may hold over coarse-grained approaches. Lastly, I consider how energy functions for folding might evolve in order to represent more accurately the molecular energy surface. I also discuss how far they might be systematically simplified, given that the relatively simple additive energy functions used today have already been quite successful.

Section snippets

Advances in sampling of folding events

The first approach successfully used to fold proteins and compute folding rates was based on distributed computing, where a large number of independent simulations are run on different computers. By running many short simulations of tens to hundreds of nanoseconds, a handful of folding events will be observed; this concept was pioneered in the Folding@Home project, initially using implicit solvent [19]. Because of the dependence on fast folding events, the rate calculation can be sensitive to
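The idea of extracting a rate from many short, independent trajectories can be sketched as follows. This is a minimal illustration and not the procedure used in the cited work: it assumes single-exponential (two-state) kinetics, so that folding events behave as a Poisson process, and the function names, parameter values and synthetic data are all hypothetical.

```python
import math
import random


def estimate_folding_rate(first_passage_times, traj_length):
    """Maximum-likelihood first-order rate from independent short runs.

    first_passage_times: one entry per trajectory, the time of the first
        folding event, or None if the run ended without folding.
    traj_length: length of each run (same time unit as the entries).
    Returns (rate, approximate standard error, number of folding events).
    """
    if not first_passage_times:
        raise ValueError("need at least one trajectory")
    n_folded = sum(1 for t in first_passage_times if t is not None)
    # Runs that never fold contribute their full length to the time "at risk".
    total_time = sum(traj_length if t is None else t
                     for t in first_passage_times)
    rate = n_folded / total_time
    # Poisson counting error: relative uncertainty ~ 1/sqrt(number of events).
    error = rate / math.sqrt(n_folded) if n_folded else float("nan")
    return rate, error, n_folded


def synthetic_runs(true_rate, traj_length, n_runs, seed=0):
    """Hypothetical data: exponential waiting times cut off at traj_length."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_runs):
        t = rng.expovariate(true_rate)
        times.append(t if t < traj_length else None)
    return times


if __name__ == "__main__":
    # Assumed numbers: folding rate of 1 per microsecond (1e-3 per ns),
    # sampled with 5000 independent runs of 100 ns each.
    runs = synthetic_runs(true_rate=1e-3, traj_length=100.0, n_runs=5000)
    k, dk, n = estimate_folding_rate(runs, traj_length=100.0)
    print(f"{n} folding events, k = {k:.2e} +/- {dk:.1e} per ns")
```

Under these assumptions the estimate is simply the number of folding events divided by the total simulated time, so its statistical uncertainty shrinks only as the inverse square root of the number of events observed. If the kinetics are not single-exponential on the time scale of the short runs, the estimator is no longer valid.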

Improvements in energy functions for folding

Clearly, the quality of any folding simulation will depend critically on the quality of the underlying energy function. Problems with older force fields have motivated a number of recent efforts to improve force field parameters, with particular emphasis on what is important for folding. A long-standing concern about all-atom force fields has been that many tend to favour formation of either α-helical or β-sheet structures. Common examples include the known bias of the Amber ff94 force field [44

What have we learnt and what could we learn?

The availability of atomistic simulation trajectories in principle allows a wealth of detail to be determined on folding mechanism. For example, in a recent benchmark study, Lindorff-Larsen et al. [15••] folded twelve different proteins to their native structure, obtaining in each case reasonable agreement with the experimentally determined folding rate. From these data, they were able to draw some general conclusions about the folding, in the context of the simulation model. They found the

How will we know if we have the right answer?

Obtaining a microscopically detailed picture of folding is clearly of little value if the picture is incorrect. The ultimate test of the accuracy of a simulation must be comparison with experimental observables. For folding, this means obtaining the correct equilibrium observables in each stable state, and the correct relaxation rates between stable states. If the experimental signal could be calculated, it would be straightforward to evaluate this from long equilibrium simulations [56] or from

Room for further improvements

Whilst folding simulations with atomistic force fields have been very successful, there are some well-known shortcomings that should be pointed out, which mainly pertain to the energy functions used. Firstly, although simulations can very often get the correct folding rate near the folding midpoint, and sometimes even the correct stability near 300 K, folding cooperativity is usually too weak. That is, even if the free energy may be approximately correct near 300 K, the temperature dependence is

Conclusion and outlook

Five years ago, it was not clear whether the same atomistic force field would, in general, be able to fold proteins from different structural classes in explicit solvent; and owing to computational limits, it was not possible to find out. Since then, very substantial advances in computing hardware and software have made it feasible to fold a range of proteins with different topologies. This result was facilitated by complementary refinements of the energy functions. Knowing this gives us a lot

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgement

David de Sancho and Kresten Lindorff-Larsen are thanked for helpful comments on the manuscript. The author is supported by a Royal Society University Research Fellowship.

References (78)

  • R.B. Best et al.

    Protein simulations with an optimized water model: cooperative helix formation and temperature-induced unfolded state collapse

    J Phys Chem B

    (2010)
  • V.A. Voelz et al.

    Unfolded state dynamics and structure of protein L characterized by simulation and experiment

    J Am Chem Soc

    (2010)
  • K.A. Merchant et al.

    Characterizing the unfolded states of proteins using single molecule FRET spectroscopy and molecular simulations

    Proc Natl Acad Sci USA

    (2007)
  • P.R. Callis et al.

    Ab initio prediction of tryptophan fluorescence quenching by protein electric field enabled electron transfer

    J Phys Chem B

    (2007)
  • J.N. Onuchic et al.

    Toward an outline of the topography of a realistic protein-folding funnel

    Proc Natl Acad Sci USA

    (1995)
  • K.A. Dill et al.

    From Levinthal to pathways to funnels

    Nat Struct Biol

    (1997)
  • P.G. Wolynes et al.

    Navigating the folding routes

    Science

    (1995)
  • M. Karplus

    Behind the folding funnel diagram

    Nat Chem Biol

    (2011)
  • J.D. Bryngelson et al.

    Intermediates and barrier crossing in a random energy model (with applications to protein folding)

    J Phys Chem

    (1989)
  • Y. Ueda et al.

    Studies on protein folding, unfolding and fluctuations by computer simulation. II. A three-dimensional lattice model of lysozyme

    Biopolymers

    (1978)
  • P.G. Wolynes

    Recent successes of the energy landscape theory of protein folding and function

    Q Rev Biophys

    (2005)
  • L.L. Chavez et al.

    Quantifying the roughness on the free energy landscape: entropic bottlenecks and protein folding rates

    J Am Chem Soc

    (2004)
  • M.B. Borgia et al.

    Single molecule fluorescence reveals sequence-specific misfolding in multidomain proteins

    Nature

    (2011)
  • A.D. MacKerell

    Empirical force fields for biological macromolecules: overview and issues

    J Comp Chem

    (2004)
  • A.J. Stone

    Intermolecular potentials

    Science

    (2008)
  • D.E. Shaw et al.

    Atomic-level characterization of the structural dynamics of proteins

    Science

    (2010)
  • K. Lindorff-Larsen et al.

    How fast-folding proteins fold

    Science

    (2011)
  • A.E. Garcia et al.

    Folding a protein on a computer: an atomic description of the folding pathway of protein A

    Proc Natl Acad Sci USA

    (2003)
  • D.L. Ensign et al.

    Heterogeneity even at the speed limit of folding: large scale molecular dynamics study of a fast-folding variant of the Villin headpiece

    J Mol Biol

    (2007)
  • R.B. Best et al.

    Microscopic events in β-hairpin folding from alternative unfolded ensembles

    Proc Natl Acad Sci USA

    (2011)
  • B. Hess et al.

    GROMACS4: algorithms for highly efficient, load-balanced, and scalable molecular simulation

    J Chem Theory Comput

    (2008)
  • J.C. Phillips et al.

    Scalable molecular dynamics with NAMD

    J Comp Chem

    (2005)
  • P.L. Freddolino et al.

    Ten-microsecond molecular dynamics simulation of a fast-folding WW domain

    Biophys J

    (2008)
  • P.L. Freddolino et al.

    Common structural transitions in explicit-solvent simulations of villin headpiece folding

    Biophys J

    (2009)
  • J. Kubelka et al.

    Chemical, physical and theoretical kinetics of an ultrafast folding protein

    Proc Natl Acad Sci USA

    (2008)
  • D.E. Shaw et al.

    Millisecond-scale molecular dynamics simulations on Anton

  • P.G. Bolhuis et al.

    Transition path sampling: throwing ropes over rough mountain passes, in the dark

    Annu Rev Phys Chem

    (2002)
  • T.S. Van Erp et al.

    Elaborating transition interface sampling methods

    J Comp Phys

    (2005)
  • P.G. Bolhuis

    Transition-path sampling of beta-hairpin folding

    Proc Natl Acad Sci USA

    (2003)