What seemed like an intractable problem is now possible: to design proteins with a specified nonlinear mechanical response, capturing complex folding and unfolding mechanisms in single- and few-shot computations.
We present ForceGen, an end-to-end algorithm for de novo protein generation based on nonlinear mechanical unfolding responses. Rooted in the physics of protein mechanics, this generative strategy provides a powerful way to design new proteins rapidly, including exquisite and rapid predictions about their dynamical behavior.
Proteins, like any other mechanical object, respond to forces in peculiar ways. Think of the different response you'd get from pulling on a steel cable versus pulling on a rubber band, or the difference between honey and glass. Now, we can design proteins with a set of desirable mechanical characteristics, with applications from health to sustainable plastics.
The key to solving this problem was to integrate a protein language model with denoising diffusion methods, and to use accurate atomistic-level physical simulation data to endow the model with a first-principles understanding. ForceGen can solve both forward and inverse tasks: In the forward task, we can predict how stable a protein is, how it will unfold and what forces are involved, all given just the sequence of amino acids. In the inverse task, we can design new proteins that meet complex nonlinear mechanical signature targets.
Read the paper, led by @LAMM_MIT postdoc Bo Ni, published in Science Advances:
science.org/doi/10.1126/scia…
Why do we care about the mechanics of proteins?
The mechanics of proteins are critical elements of many living systems, as evidenced by many studies in mechanobiology. Through evolution, nature has produced a set of remarkable protein materials with unique mechanical functions, like elastins, silks, keratins or collagens, that play crucial roles in biology. However, going beyond natural designs to discover proteins that meet specified mechanical properties remains challenging. Until now, the only ways to do this were to use existing evolutionary concepts or to manually alter proteins.
With our new generative model, we can directly design proteins to meet complex nonlinear mechanical property-design objectives. ForceGen leverages deep knowledge of protein sequences from a pretrained protein language model and maps mechanical unfolding responses to new protein sequences.
Using full-atom molecular simulations for direct validation from physical and chemical principles, we demonstrate that the designed proteins are de novo and fulfill the targeted mechanical properties, including unfolding energy, mechanical strength, and detailed unfolding force-separation curves. ForceGen offers rapid pathways to explore the enormous mechanobiological protein sequence space unconstrained by biological synthesis, to enable the discovery of new protein materials with superior mechanical properties.
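As a rough illustration of the two scalar targets mentioned above, here is a minimal Python sketch. It assumes that mechanical strength is read off as the peak force on a force-separation curve and that unfolding energy is the area under that curve. The function names and the sample numbers are made up for illustration and are not taken from the ForceGen code or data.

```python
from typing import Sequence


def mechanical_strength(forces: Sequence[float]) -> float:
    """Peak force along the unfolding curve (assumed definition of strength)."""
    return max(forces)


def unfolding_energy(separations: Sequence[float], forces: Sequence[float]) -> float:
    """Area under the force-separation curve, via the trapezoidal rule."""
    if len(separations) != len(forces):
        raise ValueError("separations and forces must have the same length")
    energy = 0.0
    for i in range(1, len(forces)):
        dx = separations[i] - separations[i - 1]
        energy += 0.5 * (forces[i] + forces[i - 1]) * dx
    return energy


# Hypothetical curve in arbitrary units; not real simulation output.
separations = [0.0, 1.0, 2.0, 3.0, 4.0]
forces = [0.0, 100.0, 300.0, 150.0, 50.0]

strength = mechanical_strength(forces)
energy = unfolding_energy(separations, forces)
print(f"strength = {strength}, unfolding energy = {energy}")
```

In an inverse-design setting, numbers like these (or the full curve) would play the role of the target that a generated sequence is later checked against by simulation.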
B. Ni, D.L. Kaplan, M.J. Buehler, ForceGen: End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a language diffusion model. Sci. Adv. 10, eadl4000 (2024). DOI: 10.1126/sciadv.adl4000
Code and model weights available on @huggingface:
huggingface.co/lamm-mit/Prot…
@KaplanLab_Tufts