IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
1.IntFold introduces a new class of biomolecular structure predictors—combining AlphaFold 3-level accuracy with fine-grained control over predictions. Its standout innovation: modular adapters enabling guided modeling of allosteric states, structural constraints, and binding affinities, critical for drug discovery.
2.On the FoldBench benchmark, IntFold matches AlphaFold 3 in protein monomer and protein-protein interaction prediction. It outperforms all other competitors—including Boltz-2 and HelixFold 3—across protein-ligand, antibody-antigen, and nucleic acid tasks.
3.A specialized variant of IntFold further improves antibody-antigen docking (success rate 43.2%, nearly matching AlphaFold 3’s 47.9%) and protein-ligand interface predictions (61.8%), narrowing performance gaps in therapeutic contexts.
4.For CDK2, a classic allosteric kinase target, general models failed to capture inhibitor-induced conformations. IntFold’s fine-tuned adapter correctly identified 4 out of 5 rare allosteric states, without degrading accuracy on common structures—showcasing robust controllability.
5.By incorporating prior knowledge as structural constraints (e.g., known binding pockets or epitopes), IntFold drastically improves predictions. On antibody-antigen interfaces, success rates jumped from 37.6% to 69.0%—a major boost for immunological modeling.
6.IntFold predicts binding affinity accurately using a downstream adapter. On DAVIS and BindingDB, it beats both structure-based and sequence-based baselines. On CASP16 targets, its predictions correlated more strongly with experimental affinities than those of Boltz-2.
7.The team developed a custom Triton-based attention kernel—FlashAttentionPairBias—more memory-efficient and faster than industry kernels from DeepSpeed and NVIDIA, enabling larger and more diverse training inputs.
8.A training-free, model-agnostic ranking method based on internal structural similarity improves prediction selection. For antibody-antigen targets, it raised success by 3% over random selection—offering a simple yet effective upgrade to multi-sample inference.
9.Training insights revealed sources of instability in large-scale biomolecular models. Solutions included layernorm tweaks, a “skip-and-recover” mechanism for exploding gradients, and carefully chosen initialization strategies, highlighting practical engineering know-how.
10.IntFold was trained on a rich and diverse dataset, including distilled predictions, curated affinity measurements, antibody-antigen distillations, and orthosteric/allosteric CDK2 complexes—laying a strong foundation for both generalization and specialization.
11.Despite its strengths, IntFold faces challenges typical of high-complexity models—such as cubic-time attention and imperfect performance on the hardest interface types. The team aims to improve scalability and expand applications into de novo protein design.
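Point 7 mentions an attention kernel with pair bias. As a rough illustration of the operation such a kernel computes, here is a naive single-head reference in plain Python. It is not the FlashAttentionPairBias kernel itself: the function name and the list-based layout are assumptions made for this sketch, and the code materializes every attention logit, which is exactly the memory cost that fused kernels in the FlashAttention family avoid.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_pair_bias(q, k, v, pair_bias):
    """Single-head attention: softmax(q k^T / sqrt(d) + pair_bias) v.

    q, k, v: lists of n vectors (each a list of floats).
    pair_bias: n x n list of floats, one additive bias per (i, j) token pair.
    Hypothetical reference implementation, not the paper's kernel.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i, q_i in enumerate(q):
        # Scaled dot-product logit plus the pair bias for each key j.
        logits = [
            scale * sum(a * b for a, b in zip(q_i, k_j)) + pair_bias[i][j]
            for j, k_j in enumerate(k)
        ]
        weights = softmax(logits)
        # Weighted sum of value vectors.
        out.append([
            sum(w * v_j[c] for w, v_j in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```

With zero queries and keys, the output is controlled entirely by the bias: a zero bias averages the value vectors, while a large negative bias on a pair removes that pair from the average.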
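Point 8 describes a training-free ranking based on similarity among sampled predictions. The thread does not say which similarity measure is used, so the sketch below is only one plausible reading: it scores each sample by its mean distance-matrix RMSD (dRMSD) to the other samples and ranks the most consensus-like sample first. The function names and the choice of dRMSD are assumptions for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All intra-structure atom-atom distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD; alignment-free, needs matching atom order."""
    da = pairwise_distances(coords_a)
    db = pairwise_distances(coords_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


def rank_by_consensus(samples):
    """Return sample indices, most consensus-like first.

    samples: list of structures, each a list of (x, y, z) tuples with the
    same atoms in the same order. Requires at least two samples with at
    least two atoms each. Hypothetical scoring, not the paper's method.
    """
    n = len(samples)
    mean_dev = []
    for i in range(n):
        others = [drmsd(samples[i], samples[j]) for j in range(n) if j != i]
        mean_dev.append(sum(others) / len(others))
    return sorted(range(n), key=lambda i: mean_dev[i])
```

Because the score uses only the predictions themselves, the same function could be applied to samples from any model; an outlier structure ends up at the bottom of the ranking.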
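Point 9 names a “skip-and-recover” mechanism for exploding gradients without describing it. The sketch below shows one way such a guard might work, under stated assumptions: an update is skipped when the gradient norm is non-finite or above a threshold, and after several consecutive skips the parameters are restored from the last saved checkpoint. The class name, thresholds, and plain SGD update are all hypothetical.

```python
import copy
import math


class SkipAndRecover:
    """Toy guard around an SGD step; illustrative only."""

    def __init__(self, max_grad_norm, max_consecutive_skips):
        self.max_grad_norm = max_grad_norm
        self.max_consecutive_skips = max_consecutive_skips
        self.consecutive_skips = 0
        self.checkpoint = None

    def save(self, params):
        """Remember a known-good copy of the parameters."""
        self.checkpoint = copy.deepcopy(params)

    def step(self, params, grads, lr):
        """Return (new_params, status) for one guarded update."""
        norm = math.sqrt(sum(g * g for g in grads))
        if not math.isfinite(norm) or norm > self.max_grad_norm:
            self.consecutive_skips += 1
            too_many = self.consecutive_skips >= self.max_consecutive_skips
            if too_many and self.checkpoint is not None:
                # Recover: roll back to the last saved parameters.
                self.consecutive_skips = 0
                return list(self.checkpoint), "recovered"
            # Skip: leave the parameters untouched for this batch.
            return params, "skipped"
        self.consecutive_skips = 0
        return [p - lr * g for p, g in zip(params, grads)], "updated"
```

In this toy version a single bad batch costs one skipped step, while a run of bad batches triggers a rollback rather than letting corrupted parameters persist.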
💻Code:
github.com/IntelliGen-AI/Int…
📜Paper:
arxiv.org/abs/2507.02025v1
#ProteinFolding #DrugDiscovery #AlphaFold #Biotech #ComputationalBiology #DeepLearning #IntFold #AntibodyDesign #MolecularModeling #BindingAffinity #AllostericPrediction