All-atom inverse protein folding through discrete flow matching
1. ADFLIP, a novel generative model for protein sequence design, has been introduced by Yi et al. This model leverages discrete flow matching to design protein sequences conditioned on all-atom structural contexts, including non-protein elements like ligands, nucleotides, and metal ions. It addresses the challenges of designing sequences for complex biomolecular assemblies and dynamic protein complexes.
2. ADFLIP incorporates predicted amino acid side chains progressively during sequence generation, providing crucial structural context that defines specific interactions with other molecules. This approach is particularly innovative for designing protein-ligand interactions and dynamic complexes with multiple structural states.
3. The model employs a multi-scale graph neural network as the denoising backbone, integrating both atom and residue-level information. This allows ADFLIP to capture detailed structural nuances, leading to state-of-the-art performance in single-structure and multi-structure inverse folding tasks.
4. ADFLIP implements training-free classifier guidance sampling, enabling the integration of arbitrary pre-trained models to optimize designed sequences for desired protein properties. This flexibility allows researchers to steer sequence generation towards specific outcomes without retraining the model.
5. The performance of ADFLIP was evaluated on protein complexes with small-molecule ligands, nucleotides, and metal ions, including dynamic complexes determined by NMR. The model demonstrated excellent potential for all-atom protein design, outperforming existing methods in sequence recovery and foldability.
6. ADFLIP's ability to handle dynamic protein complexes through ensemble sampling across multiple structural states is a significant step forward in protein design. This capability is essential for designing proteins that undergo conformational changes during their functional cycles.
7. The training-free guidance sampling mechanism allows ADFLIP to leverage powerful existing regressors for guided generation, making it a versatile tool for protein design. This approach was demonstrated by guiding sequence generation towards higher predicted binding affinities using DSMBind.
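To make points 2, 6, and 7 concrete, here is a toy Python sketch of the general idea: residues are decoded progressively, per-position predictions are averaged over several structural states, and a property score reweights each sampling step without any retraining. This is not ADFLIP's actual algorithm or code. `toy_denoiser` and `toy_property` are hypothetical stand-ins for the structure-conditioned network and for a pre-trained regressor such as DSMBind, and the simple masked decoding loop is an assumed simplification of discrete flow matching sampling.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")
MASK = "-"

def toy_denoiser(seq, state):
    """Hypothetical stand-in for the structure-conditioned network.

    Returns one probability vector over ALPHABET per position. A real model
    would condition on the decoded residues and their side chains; this toy
    only uses the position index and a number identifying the state.
    """
    probs = []
    for i in range(len(seq)):
        logits = [math.sin(0.7 * (i + 1) * (k + 1) + state)
                  for k in range(len(ALPHABET))]
        z = sum(math.exp(l) for l in logits)
        probs.append([math.exp(l) / z for l in logits])
    return probs

def ensemble_probs(seq, states):
    """Average the per-position probabilities over all structural states."""
    per_state = [toy_denoiser(seq, s) for s in states]
    return [[sum(p[i][k] for p in per_state) / len(states)
             for k in range(len(ALPHABET))]
            for i in range(len(seq))]

def toy_property(seq):
    """Hypothetical regressor: fraction of hydrophobic residues."""
    return sum(a in HYDROPHOBIC for a in seq) / len(seq)

def guided_probs(seq, i, probs_i, scale):
    """Reweight p(a) by exp(scale * property) of the candidate, then renormalise."""
    weights = []
    for k, a in enumerate(ALPHABET):
        candidate = seq[:i] + [a] + seq[i + 1:]
        weights.append(probs_i[k] * math.exp(scale * toy_property(candidate)))
    z = sum(weights)
    return [w / z for w in weights]

def sample_sequence(length, states, steps=4, scale=0.0, seed=0):
    """Decode a fully masked sequence in `steps` rounds of unmasking."""
    rng = random.Random(seed)
    seq = [MASK] * length
    masked = list(range(length))
    rng.shuffle(masked)
    per_step = math.ceil(length / steps)
    while masked:
        probs = ensemble_probs(seq, states)  # recomputed every round
        for i in masked[:per_step]:
            p = guided_probs(seq, i, probs[i], scale)
            seq[i] = rng.choices(ALPHABET, weights=p)[0]
        masked = masked[per_step:]
    return "".join(seq)

states = [0.0, 0.5, 1.0]  # e.g. three NMR conformers, encoded as toy numbers
unguided = sample_sequence(12, states, scale=0.0, seed=1)
guided = sample_sequence(12, states, scale=200.0, seed=1)
```

With `scale=0.0` the property score has no effect and the loop samples from the ensemble-averaged distribution alone. Raising `scale` pushes samples towards sequences the scorer prefers, which is the sense in which the guidance is training-free: only the sampling step changes.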
Paper:
raw.githubusercontent.com/ml…
#ProteinDesign #DiscreteFlowMatching #AllAtomModeling #Bioinformatics #MachineLearning