The purpose of protein design is to identify novel protein activity or to leverage advanced knowledge of a protein's function.
There are two approaches to protein design. The first is de novo design, in which proteins are designed from scratch. The second is protein redesign, in which calculated variants are made of a known protein sequence and structure.
The goal of protein design is to find a protein sequence that will fold precisely into a target structure.
Overview of protein design
Proteins fold into a 3D structure that is determined by their amino acid sequence. The side chain of each amino acid can rotate about up to four dihedral angles relative to the protein backbone.
The protein 3D structure is defined by the backbone and the set of side-chain rotations. This is usually referred to as the protein conformation, which plays a pivotal role in determining the chemical reactivity and biological function of a given protein.
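As an illustration of what a side-chain dihedral angle is, the sketch below computes the signed dihedral defined by four points in space. For a first side-chain angle these would typically be four consecutive atoms such as N, CA, CB, and CG. The function name and the coordinates are assumptions made for this example, not part of any particular software package.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (hypothetical helper).

    The angle is measured around the p1-p2 axis: 0 means p0 and p3 are
    eclipsed (cis), and +/-180 means they are on opposite sides (trans).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(_cross(n1, n2), b1) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the central bond lies along the z axis.
cis_angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter_turn = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans_angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

A side chain with four such angles is described by four numbers, which is why a side-chain conformation can be stored compactly as a short tuple of angles.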
There are several challenges with computational protein design. The first obstacle is the exponential size of the conformational and protein sequence space that needs to be investigated.
The second is accurately predicting the structure of a particular sequence. Protein design is therefore usually treated as an inverse folding problem: identifying an amino acid sequence that folds into a target 3D scaffold and meets the design objective.
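To give a feel for the exponential growth of the search space, the short calculation below multiplies the number of options at each designable position. The figures used (10 positions, 20 amino acid types, and roughly 10 side-chain conformations per type) are illustrative assumptions, not values from a specific design study.

```python
from math import prod

def design_space_size(options_per_position):
    """Number of sequence-conformation combinations for independent positions."""
    return prod(options_per_position)

# Hypothetical design: 10 mutable positions, each allowing
# 20 amino acid types with about 10 side-chain conformations apiece.
options = [20 * 10] * 10
size = design_space_size(options)
print(f"{size:.3e}")  # prints 1.024e+23
```

Each additional designable position multiplies the total by another factor of 200 under these assumptions, which is why exhaustive enumeration quickly becomes impractical.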
Conformational stability can be assessed by evaluating the energy of a conformation, with a stable fold corresponding to an energy minimum. Two approximations are commonly used in computational protein design.
First, the designed protein is assumed to retain the overall fold of the selected scaffold, so the protein backbone is treated as fixed. Computational biologists then choose specific positions at which the amino acid, and therefore the side chain, may be changed.
Second, the continuous domain of conformations available to each amino acid side chain is approximated by a discrete set of conformations defined by their internal dihedral angles.
These conformations, or rotamers, are derived from the most common conformations observed in experimental repositories of known protein structures, such as the Protein Data Bank.
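The discretization described above can be sketched as snapping a continuous dihedral angle to the nearest entry in a small library. The three values used here (the staggered positions near -60, 60, and 180 degrees) are illustrative only; real rotamer libraries hold statistically derived angles for each amino acid type, and the names below are assumptions for this example.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

# Hypothetical single-angle rotamer library (illustrative values only).
CHI1_ROTAMERS = (-60.0, 60.0, 180.0)

def nearest_rotamer(chi, library=CHI1_ROTAMERS):
    """Return the library angle closest to a continuous dihedral angle."""
    return min(library, key=lambda r: angular_distance(chi, r))

# -175 degrees is only 5 degrees away from 180 once periodicity is considered.
snapped = nearest_rotamer(-175.0)
```

Replacing a continuous range of angles with a handful of discrete choices is what turns side-chain placement into a combinatorial selection problem.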
Computational protein design is formulated as the problem of determining the minimum energy conformation through the mutation of a particular subset of amino acid residues.
The global minimum energy conformation (GMEC) is the conformation with the lowest energy out of all possible conformations. Protein design therefore requires a computationally tractable energy model to estimate the energy of any combination of rotamers.
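One commonly used way to keep the energy model tractable, assumed here for illustration and in line with the optimization formulation of Allouche et al. (2014) listed in the references, is to decompose the energy into a constant term, one term per position, and one term per pair of positions:

```latex
E(r_1, \dots, r_n) = E_{\varnothing}
  + \sum_{i=1}^{n} E_i(r_i)
  + \sum_{1 \le i < j \le n} E_{ij}(r_i, r_j),
\qquad
(r_1^{*}, \dots, r_n^{*}) = \operatorname*{arg\,min}_{r_1, \dots, r_n} E(r_1, \dots, r_n)
```

Here $r_i$ denotes the amino acid and rotamer chosen at position $i$, $E_{\varnothing}$ collects contributions that do not depend on any choice (such as the fixed backbone), $E_i$ captures the interaction of a rotamer with the backbone, and $E_{ij}$ captures the interaction between two rotamers. The symbols are notation chosen for this sketch. Because every term depends on at most two positions, all terms can be computed once and stored in tables before the search begins.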
Efficient optimization techniques are then needed to identify the sequence and conformation that achieve the global minimum energy.
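For a very small problem, the global minimum energy conformation can be found by simply enumerating every combination. The sketch below does this for a toy system with precomputed single-position and pairwise energies. All energies, labels, and function names are made up for illustration, and exhaustive search is shown only because it is easy to verify, not because it scales to realistic designs.

```python
from itertools import product

def total_energy(assignment, unary, pairwise):
    """Sum single-position and pairwise energies for one assignment."""
    energy = sum(unary[i][choice] for i, choice in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            pair_table = pairwise.get((i, j), {})
            # Pairs missing from the table are assumed not to interact.
            energy += pair_table.get((assignment[i], assignment[j]), 0.0)
    return energy

def brute_force_gmec(unary, pairwise):
    """Enumerate all assignments and return the lowest-energy one."""
    choices = [list(table.keys()) for table in unary]
    best = min(product(*choices),
               key=lambda a: total_energy(a, unary, pairwise))
    return best, total_energy(best, unary, pairwise)

# Hypothetical toy system: two positions, labels are (amino acid, rotamer index).
unary = [
    {("ALA", 0): 1.0, ("LEU", 0): 0.5, ("LEU", 1): 0.8},
    {("SER", 0): 0.7, ("VAL", 0): 0.4},
]
pairwise = {
    (0, 1): {
        (("LEU", 0), ("VAL", 0)): 2.0,    # made-up steric clash
        (("LEU", 1), ("VAL", 0)): -1.0,   # made-up favorable packing
    }
}

gmec, gmec_energy = brute_force_gmec(unary, pairwise)
```

In this toy case the best single-position choice (LEU rotamer 0) is not part of the global minimum, because its clash with VAL outweighs its lower individual energy. This coupling between positions is what makes the search hard at realistic sizes.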
Early work in protein design
In the late 1980s, DeGrado and colleagues carried out pioneering work in which they designed a 4-helix bundle protein based on model building and energy minimization studies. The helices were designed first, based on the helix-forming propensity of amino acids.
Then, helix-helix interaction interfaces were established. Linking all four helices together was the last step. The designs were checked at each step to verify that they had the desired characteristics. A few years later, a complementary approach was adopted by the Richardsons and colleagues, who also created a 4-helix bundle protein.
However, they aimed to maximize the number of amino acid types used in the design, to make the sequence as natural as possible. Both groups (DeGrado and the Richardsons) used simple solution methods such as circular dichroism to show that their designs were monomeric, compact helical proteins.
Both approaches tackled the protein folding problem in reverse. That is, they did not aim to predict a 3D structure, but only to identify a sequence of amino acids that was compatible with a specific fold. This is considered to be the first step in protein design.
Recent work in protein design
The next step in advancing protein design was to design metal ion binding sites. This was an important step not only because many protein functions, such as electron transfer and catalysis, depend on metal ions, but also because various spectroscopic techniques can be applied to identify the binding geometry of proteins in solution.
Baker and colleagues created and experimentally validated “Top7”, a novel protein with an arbitrarily chosen three-dimensional structure, using their RosettaDesign software.
Using 3- and 9-residue fragments obtained from the Protein Data Bank, they constructed the protein scaffold. The optimal combinations were then chosen via Monte Carlo optimization of energy terms such as β-strand hydrogen bonding, hydrophobic burial, and side-chain rotamers.
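The general idea of Monte Carlo optimization can be sketched with a simple Metropolis search over sequences. This is a generic toy example and not the procedure used in the Rosetta software: the energy function, the burial pattern, the residue alphabet, and all parameter values are assumptions made purely for illustration.

```python
import math
import random

def metropolis_search(energy, choices, steps=2000, temperature=1.0, seed=0):
    """Generic Metropolis Monte Carlo search over discrete choices."""
    rng = random.Random(seed)
    current = [rng.choice(options) for options in choices]
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        # Propose a change at one randomly selected position.
        pos = rng.randrange(len(choices))
        trial = list(current)
        trial[pos] = rng.choice(choices[pos])
        delta = energy(trial) - current_e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, current_e + delta
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

# Hypothetical toy energy: hydrophobic residues are rewarded at buried
# positions and penalized at exposed ones (and the reverse for polar residues).
BURIED = (True, False, True, True, False)
HYDROPHOBIC = {"L", "V", "I", "F"}

def toy_energy(sequence):
    return sum(-1.0 if (aa in HYDROPHOBIC) == buried else 1.0
               for aa, buried in zip(sequence, BURIED))

choices = [list("LVIFSTDK")] * len(BURIED)
best_seq, best_energy = metropolis_search(toy_energy, choices)
```

Occasionally accepting an uphill move lets the search escape local minima, at the cost of giving no guarantee that the global minimum has been found.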
DeGrado and colleagues developed “knowledge-based” approaches to design constructs for many purposes, including protein crystals, surface-organizing peptide superstructures, and transmembrane-binding peptides.
Their computed helical anti-membrane protein (CHAMP) protocol facilitated the design of α- and β-peptides that associate specifically with a transmembrane helix of the selected protein.
Using fluorescence resonance energy transfer, the binding of designed CHAMP transmembrane peptides to their target integrin transmembrane peptide has been confirmed in micelles.
Additionally, the efficacy and specificity of CHAMP peptides binding to their target integrin transmembrane peptide were demonstrated through a biological activity assay.
References
- Allouche, D., André, I., Barbe, S., Davies, J., de Givry, S., Katsirelos, G., O'Sullivan, B., Prestwich, S., Schiex, T. and Traoré, S., 2014. Computational protein design as an optimization problem. Artificial Intelligence, 212, pp.59-79.
- Butterfoss, G.L., and Kuhlman, B., 2006. Computer-based design of novel protein structures. Annual Review of Biophysics and Biomolecular Structure, 35, pp.49-65.
- Hecht, M.H., Richardson, J.S., Richardson, D.C., and Ogden, R.C., 1990. De novo design, expression, and characterization of Felix: a four-helix bundle protein of native-like sequence. Science, 249(4971), pp.884-891.
- Ho, S.P., and DeGrado, W.F., 1987. Design of a 4-helix bundle protein: synthesis of peptides which self-associate into a helical protein. Journal of the American Chemical Society, 109(22), pp.6751-6758.
- Kuhlman, B., Dantas, G., Ireton, G.C., Varani, G., Stoddard, B.L., and Baker, D., 2003. Design of a novel globular protein fold with atomic-level accuracy. Science, 302(5649), pp.1364-1368.
- Regan, L., and Clarke, N.D., 1990. Tetrahedral zinc (II)-binding site introduced into a designed protein. Biochemistry, 29(49), pp.10878-10883.
- Regan, L., 1993. The design of metal-binding sites in proteins. Annual Review of Biophysics and Biomolecular Structure, 22(1), pp.257-281.
- Regan, L., Caballero, D., Hinrichsen, M.R., Virrueta, A., Williams, D.M. and O'Hern, C.S., 2015. Protein design: past, present, and future. Peptide Science, 104(4), pp.334-350.
- Shandler, S.J., Korendovych, I.V., Moore, D.T., Smith-Dupont, K.B., Streu, C.N., Litvinov, R.I., Billings, P.C., Gai, F., Bennett, J.S. and DeGrado, W.F., 2011. Computational design of a β-peptide that targets transmembrane helices. Journal of the American Chemical Society, 133(32), pp.12378-12381.
- Walters, R.F.S., and DeGrado, W.F., 2006. Helix-packing motifs in membrane proteins. Proceedings of the National Academy of Sciences, 103(37), pp.13658-13663.
- Yin, H., Slusky, J.S., Berger, B.W., Walters, R.S., Vilaire, G., Litvinov, R.I., Lear, J.D., Caputo, G.A., Bennett, J.S. and DeGrado, W.F., 2007. Computational design of peptides that target transmembrane helices. Science, 315(5820), pp.1817-1822.
- Zhou, J., Panaitiu, A.E., and Grigoryan, G., 2020. A general-purpose protein design framework based on mining sequence-structure relationships in known protein structures. Proceedings of the National Academy of Sciences, 117(2), pp.1059-1068.