Design and designability of protein-based assemblies

https://doi.org/10.1016/j.sbi.2014.05.009

Highlights

  • We review recent successes in the atomistic design of protein-based assemblies.

  • Assembly specificity and assembly folding problems are two major challenges.

  • Symmetry and reuse of native-like structural modules have played significant roles.

  • Designability may be a common theme in successful assemblies.

  • Remaining methodological limitations mean assembly design is not yet robust.

Design of protein-based assemblies is an exciting frontier in molecular engineering. It can be seen as an extension of the protein design problem, but with some added hurdles. In recent years, much of the work in the field has concentrated on patterning existing protein structural units (proteins, oligomers, or structural motifs) to design diverse assembly geometries, using symmetry to encode both “infinite” lattices and finite-sized supramolecular particles. Despite impressive successes, several key challenges remain. Among these are the specificity problem (the need to engineer preference for the intended assembly geometry over all alternatives) and the folding problem (understanding which thermodynamic or kinetic features of assembly subunits and inter-subunit interfaces lead to successfully folding superstructures, and how to encode these in the amino-acid sequence). Here we focus on recent results in the context of these two problems, summarizing commonalities in successful approaches. We find that natural designability of assembly elements (i.e., their compatibility with diverse populations of natural amino-acid sequences) may be a unifying property of successful designs.

Introduction

Molecular self-assembly, if we learn to control it, can become a powerful means of nanotechnological fabrication, enabling fine control of organization at the atomic scale. Engineering of structures to support complex tissue growth, drug delivery, and materials design are just some of the applications that stand to benefit from controlled assembly [1, 2, 3]. Living systems appear to have mastered this process to an impressive extent [4•, 5]. Protein assemblies form the basis of numerous biological machines, the cytoskeleton and other mechanical structures, intracellular micro-compartments, and vaults [4•, 6, 7, 8]. Motivated by these and other exquisite biological structures, there has long been a desire to learn to engineer protein-based assemblies on demand. DNA origami represents a success story of using a biological polymer to encode bottom-up organization [9]. The prospect of achieving similar success with proteins is immensely attractive given their added functional versatility.

In recent years, there has been growing interest in approaching the engineering of protein-based assemblies with the tools of computational protein design, an area in which considerable success has been achieved even though the underlying problem remains fundamentally unresolved [10, 11]. The primary driver of this interest is that computational protein design can potentially enable the specification of assemblies at a detailed atomistic level. An added challenge in designing assemblies, relative to individual proteins, is the need to tune inter-subunit interactions to encode the desired geometry at larger length scales. Researchers have frequently relied on existing stable protein units of structure, perturbing them minimally to favor desired assembly architectures [12]. The continued expansion of the Protein Data Bank (PDB) means that ever more units of structure are available for such engineering. On the other hand, two major difficulties remain. First, even small variations in interfacial geometry can lead to significant deviations from the intended assembly structure at long length scales. This is part of the more general assembly specificity problem, that is, encoding the intended geometry over all other possibilities. Second, whereas the notion that the folded state corresponds to the lowest free-energy ensemble is fairly well accepted for single-domain proteins, this is less clear for assemblies, complicating the assembly folding problem (i.e., identifying sequence features that lead to successful folding). Indeed, many examples are known where self-assembly does not appear to be under thermodynamic control. For example, crystallization of proteins and other molecules is generally thought to be under the kinetic control of nucleation [13], as is the assembly of amyloid fibrils [14]. Also, recent evidence suggests that natural assemblies are under evolutionary pressure to preserve the order of assembly, further illustrating the importance of kinetics [15].
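The first difficulty, the accumulation of small interfacial errors over long length scales, can be illustrated with a toy calculation. The sketch below is not taken from any of the reviewed studies. It assumes a purely planar model in which each subunit is a step of unit length and each interface contributes a fixed turn angle, so that an ideal 12-subunit ring needs a turn of 30 degrees per interface. The function name `closure_gap`, the ring size, and the error values are all hypothetical choices made for illustration.

```python
import math

def closure_gap(n_subunits, angle_error_deg, step=1.0):
    """Distance between the two ends of a planar chain of n_subunits
    identical steps, each turned by (360/n + error) degrees from the last.
    A gap of zero means the ring closes as intended (toy model only)."""
    turn = math.radians(360.0 / n_subunits + angle_error_deg)
    x = y = 0.0
    for k in range(n_subunits):
        # The k-th subunit points along the accumulated turn angle.
        x += step * math.cos(k * turn)
        y += step * math.sin(k * turn)
    return math.hypot(x, y)

# Hypothetical per-interface angular errors for a 12-subunit ring.
for err in (0.0, 0.5, 1.0, 2.0):
    gap = closure_gap(12, err)
    print(f"error {err:3.1f} deg per interface -> gap {gap:.2f} subunit lengths")
```

In this model the per-interface error is multiplied by the number of subunits before the ring attempts to close, which is why an error of a degree or two per interface already leaves a gap that is a sizable fraction of a subunit. Real assemblies are three-dimensional and flexible, so the numbers here are only indicative of the trend.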

Parallel problems have been encountered in de novo design of individual proteins. The specificity problem here has been challenging to address explicitly [16, 17], with most studies ignoring the possibility of alternative folded structures [18]. The folding problem has also been a challenge. For example, it has been difficult to predict a priori whether a given protein structure is likely to have a funnel-like free-energy landscape for any sequence — that is, whether the structure is designable. The existence of such funnels is thought to enable the robust folding of many native proteins. Thus, investigators have tended to reuse native structures or structural elements as templates for design, aiming to ensure their designability.

By analogy with single-protein design, it would seem advantageous to choose designable assembly architectures as targets. The designability of assembly building blocks is assured with the reuse of native-like structural units, but inter-subunit geometries are specific to each architecture and should also be designable. A variety of creative approaches have been reported that lead to assembly into intended structures, and several recent reviews summarize these results well [5, 12, 19, 20, 21, 22]. Here we focus on the design of protein-based assemblies on the atomic level, attempting to summarize commonalities in successful studies in the context of assembly specificity and folding problems.

Symmetry as a design element

One way of dealing with the specificity problem has been to employ the power of symmetry (see Figure 1A). The idea is that internal symmetries present in an assembly architecture reduce the number of unique structural features to design and amplify the significance of any inter-subunit interaction by the number of symmetry-equivalent copies. Thus, even if the optimal interface between a pair of assembly subunits differs somewhat from that targeted in design, the collective effect of gaining a

Coiled coils as symmetry organizers

α-Helical coiled coils are perhaps naturally suited for organizing symmetry and architecture due to their ability to take on a variety of oligomerization and orientation states [27, 28]. Several coiled-coil toolkits have now been developed containing well-characterized peptides with a range of topologies [29, 30, 31]. Further, the ever-growing set of natural coiled-coil examples, now well organized and classified in the CC+ database [32], is serving as an additional source of inspiration for

Interface design and the assembly folding problem

In a classic paper, Wolynes argues that symmetric structures are more likely to be designable; that is, they are more likely to represent the global minimum of the energy landscape for a sequence [42]. Thus, choosing highly symmetric topologies may not only address specificity, as discussed above, but may also help simplify the problem of assembly folding. Nevertheless, symmetry alone is not enough, and subtle details of inter-subunit interactions are likely critical. For example, with an elegant

Assembly designability

Recent work has showcased the versatility of not relying on existing oligomeric structures to encode inter-subunit interactions [44••, 45••, 46••, 49•]. On the other hand, reuse of known protein complexes for this purpose favors the designability and foldability of assembly interfaces, both highly desirable properties. The optimal middle ground may be in reusing elements of native interfaces (i.e., interfacial motifs), but not necessarily full complexes. This seems especially apt in light of

Remaining challenges

Despite the highly impressive achievements summarized above, major challenges and methodological limitations remain. Recent studies have developed creative ways of alleviating both assembly folding and specificity problems, but no general solutions to these exist. In part, this is due to the difficulty of applying sufficiently accurate physical models in the context of design [10], but it is also due to the complex nature of thermodynamic and kinetic considerations governing the formation of

Note added in the proof

In very recent work, Baker and co-workers demonstrated that their computational method of modifying existing oligomeric units to design symmetric assemblies can be extended to allow for multi-component polyhedra [58••]. The authors designed five 24-subunit nanoparticles; crystal structures solved for four of these show detailed agreement with the design models.

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

Funding for work related to protein-based assemblies in the Grigoryan laboratory is provided by the Alfred P. Sloan Research Fellowship to GG (BR2013-038), the Neukom Institute for Computational Science at Dartmouth College, and the NSF (equipment grant CNS-1205521).

References (58)

  • S. Božič et al.

    New designed protein assemblies

    Curr Opin Chem Biol

    (2013)
  • K. Matsuura

    Rational design of self-assembled proteins and peptides for nano- and micro-sized architectures

    RSC Adv

    (2014)
  • J. Padilla et al.

    Nanohedra: using symmetry to design self assembling protein cages, layers, crystals, and filaments

    Proc Natl Acad Sci U S A

    (2001)
  • J. Fletcher et al.

    A basis set of de novo coiled-coil peptide oligomers for rational protein design and synthetic biology

    ACS Synth Biol

    (2012)
  • T. Sharp et al.

    Cryo-transmission electron microscopy structure of a gigadalton peptide fiber of de novo design

    Proc Natl Acad Sci U S A

    (2012)
  • J. Karanicolas et al.

    A de novo protein binding pair by computational design and directed evolution

    Mol Cell

    (2011)
  • B. Der et al.

    Strategies to control the binding mode of de novo designed protein interactions

    Curr Opin Struct Biol

    (2013)
  • D. Woolfson et al.

    More than just bare scaffolds: towards multi-component and decorated fibrous biomaterials

    Chem Soc Rev

    (2010)
  • E. Jones et al.

    Structure and function in complex macromolecular assemblies: some evolutionary themes

    Curr Opin Struct Biol

    (2012)
  • S. Howorka

    Rationally engineering natural protein assemblies in nanobiotechnology

    Curr Opin Biotechnol

    (2011)
  • Z. Li et al.

    Energy functions in de novo protein design: current challenges and future prospects

    Annu Rev Biophys

    (2013)
  • R. Pantazes et al.

    Recent advances in computational protein design

    Curr Opin Struct Biol

    (2011)
  • R. Pellarin et al.

    Amyloid fibril polymorphism is under kinetic control

    J Am Chem Soc

    (2010)
  • J. Marsh et al.

    Protein complexes are under evolutionary selection to assemble via ordered pathways

    Cell

    (2013)
  • J.J. Havranek et al.

    Automated design of specificity in molecular recognition

    Nat Struct Biol

    (2002)
  • J. Havranek

    Specificity in computational protein design

    J Biol Chem

    (2010)
  • J. Sinclair

    Constructing arrays of proteins

    Curr Opin Chem Biol

    (2013)
  • Y.-T. Lai et al.

    Structure of a 16-nm cage designed by using protein oligomers

    Science

    (2012)
  • J. Sinclair et al.

    Generation of protein lattices by fusing proteins with matching rotational symmetry

    Nat Nanotechnol

    (2011)