Design and designability of protein-based assemblies
Introduction
Molecular self-assembly, if we learn to control it, can become a powerful means of nanotechnological fabrication, enabling fine control of organization at the atomic scale. Engineering structures to support complex tissue growth, drug delivery, and materials design are just some of the applications that stand to benefit from controlled assembly [1, 2, 3]. Living systems appear to have mastered this process to an impressive extent [4•, 5]. Protein assemblies form the basis of numerous biological machines, the cytoskeleton and other mechanical structures, intracellular micro-compartments, and vaults [4•, 6, 7, 8]. Motivated by these and other exquisite biological structures, there has long been a desire to learn to engineer protein-based assemblies on demand. DNA origami represents a success story of using a biological polymer to encode bottom-up organization [9]. The prospect of achieving similar success with proteins is immensely attractive given their added functional versatility.
In recent years, there has been growing interest in approaching the engineering of protein-based assemblies with the tools of computational protein design, a problem in which considerable success has been reached, though it remains fundamentally unresolved [10, 11]. The primary driver of this interest is that computational protein design can potentially enable the specification of assemblies at a detailed atomistic level. An added challenge in designing assemblies, relative to individual proteins, is the need to tune inter-subunit interactions to encode the desired geometry at larger length scales. Researchers have frequently relied on existing stable protein units of structure, perturbing them minimally to favor desired assembly architectures [12]. The continued expansion of the Protein Data Bank (PDB) means that ever more units of structure are available for such engineering. Two major difficulties remain, however. First, even small variations in interfacial geometry can lead to significant deviations from the intended assembly structure at long length scales. This is part of the more general assembly specificity problem, that is, encoding the intended geometry over all other possibilities. Second, whereas the notion that the folded state corresponds to the lowest free-energy ensemble is fairly well accepted for single-domain proteins, it is less clear for assemblies, complicating the assembly folding problem (i.e., identifying sequence features that lead to successful folding). Indeed, many examples are known where self-assembly does not appear to be under thermodynamic control. For example, crystallization of proteins and other molecules is generally thought to be under the kinetic control of nucleation [13], as is the assembly of amyloid fibrils [14]. Recent evidence also suggests that natural assemblies are under evolutionary pressure to preserve the order of assembly, further illustrating the importance of kinetics [15].
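The sensitivity of long-range geometry to small interfacial errors can be illustrated with a toy calculation. The sketch below is not taken from the review: it assumes a hypothetical planar ring of identical subunits, each modeled as a rigid step of fixed length followed by a fixed turn, and the function name `closure_gap` and all parameter values are illustrative only. It reports how far the last subunit ends up from the first when every interface angle is off by the same small amount.

```python
import math


def closure_gap(n_subunits, step, angle_error_deg=0.0):
    """Distance between the start and end of a chain meant to close into a ring.

    Toy 2D model (an assumption, not a physical force field): each subunit
    advances the chain by `step` and then turns by the ideal ring angle
    (360/n degrees) plus a constant per-interface error.
    """
    turn = 2.0 * math.pi / n_subunits + math.radians(angle_error_deg)
    x = y = 0.0
    heading = 0.0
    for _ in range(n_subunits):
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        heading += turn
    # A perfectly encoded ring returns to the origin, giving a gap of ~0.
    return math.hypot(x, y)


if __name__ == "__main__":
    # Hypothetical 12-subunit ring with a 30 Angstrom subunit spacing.
    for err in (0.0, 1.0, 2.0):
        gap = closure_gap(12, 30.0, err)
        print(f"per-interface error {err:.1f} deg -> closure gap {gap:.2f} A")
```

In this simplified model a per-interface error of only a degree or two leaves a gap comparable to a sizable fraction of the subunit spacing, which is one way to picture why encoding the intended geometry over alternatives is demanding.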
Parallel problems have been encountered in de novo design of individual proteins. The specificity problem here has been challenging to address explicitly [16, 17], with most studies ignoring the possibility of alternative folded structures [18]. The folding problem has also been a challenge. For example, it has been difficult to predict a priori whether a given protein structure is likely to have a funnel-like free-energy landscape for any sequence — that is, whether the structure is designable. The existence of such funnels is thought to enable the robust folding of many native proteins. Thus, investigators have tended to reuse native structures or structural elements as templates for design, aiming to ensure their designability.
By analogy with single-protein design, it would seem advantageous to choose designable assembly architectures as targets. The designability of assembly building blocks is assured by reusing native-like structural units, but inter-subunit geometries are specific to each architecture and should also be designable. A variety of creative approaches have been reported that lead to assembly into intended structures, and several recent reviews summarize these results well [5, 12, 19, 20, 21, 22]. Here we focus on the design of protein-based assemblies at the atomic level, attempting to summarize commonalities among successful studies in the context of the assembly specificity and folding problems.
Symmetry as a design element
One way of dealing with the specificity problem has been to employ the power of symmetry (see Figure 1A). The idea is that internal symmetries present in an assembly architecture reduce the number of unique structural features to design and amplify the significance of any inter-subunit interaction by the number of symmetry equivalent copies. Thus, even if the optimal interface between a pair of assembly subunits differs somewhat from that targeted in design, the collective effect of gaining a
Coiled coils as symmetry organizers
α-Helical coiled coils are perhaps naturally suited for organizing symmetry and architecture due to their ability to take on a variety of oligomerization and orientation states [27, 28]. Several coiled-coil toolkits have now been developed containing well-characterized peptides with a range of topologies [29, 30, 31]. Further, the ever-growing set of natural coiled-coil examples, now well organized and classified in the CC+ database [32], is serving as an additional source of inspiration for
Interface design and the assembly folding problem
In a classic paper, Wolynes argues that symmetric structures are more likely to be designable, that is, they are more likely to represent the global minimum of the energy landscape for a sequence [42]. Thus, choosing highly symmetric topologies may not only address specificity, as discussed above, but may also help simplify the problem of assembly folding. Nevertheless, symmetry alone is not enough, and subtle details of inter-subunit interactions are likely critical. For example, with an elegant
Assembly designability
Recent work has showcased the versatility of not relying on existing oligomeric structures to encode inter-subunit interactions [44••, 45••, 46••, 49•]. On the other hand, reuse of known protein complexes for this purpose favors the designability and foldability of assembly interfaces, both highly desirable properties. The optimal middle ground may be in reusing elements of native interfaces (i.e., interfacial motifs), but not necessarily full complexes. This seems especially apt in light of
Remaining challenges
Despite the highly impressive achievements summarized above, major challenges and methodological limitations remain. Recent studies have developed creative ways of alleviating both assembly folding and specificity problems, but no general solutions to these exist. In part, this is due to the difficulty of applying sufficiently accurate physical models in the context of design [10], but it is also due to the complex nature of thermodynamic and kinetic considerations governing the formation of
Note added in the proof
In very recent work, Baker and co-workers demonstrated that their computational method of modifying existing oligomeric units to design symmetric assemblies can be extended to allow for multi-component polyhedra [58••]. The authors designed five 24-subunit nano-particles, with crystal structures solved for four of these illustrating detailed agreement with design models.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
Funding for work related to protein-based assemblies in the Grigoryan laboratory is provided by the Alfred P. Sloan Research Fellowship to GG (BR2013-038), the Neukom Institute for Computational Science at Dartmouth College, and the NSF (equipment grant CNS-1205521).
References (58)
- et al. Directing the assembly of spatially organized multicomponent tissues from the bottom up. Trends Cell Biol (2012)
- et al. Advanced materials and processing for drug delivery: the past and the future. Adv Drug Del Rev (2013)
- et al. Directed cytoskeleton self-organization. Trends Cell Biol (2012)
- et al. Designing biological compartmentalization. Trends Cell Biol (2012)
- et al. Vault particles: a new generation of delivery nanodevices. Curr Opin Biotechnol (2012)
- et al. DNA origami: a quantum leap for self-assembly of complex structures. Chem Soc Rev (2011)
- et al. Principles for designing ordered protein assemblies. Trends Cell Biol (2012)
- Kinetics and mechanisms of protein crystallization at the molecular level. Methods Mol Biol (Clifton, N.J.) (2005)
- et al. Design of protein-interaction specificity gives selective bZIP-binding peptides. Nature (2009)
- et al. Practical approaches to designing novel protein assemblies. Curr Opin Struct Biol (2013)