PPIscreenML is a method for structure-based screening of protein-protein interactions using AlphaFold
1. Mischley et al. present PPIscreenML, a structure-based ML classifier that predicts whether a candidate protein pair truly interacts, using AlphaFold2-Multimer models as input rather than relying on sequence-only inference.
2. The key idea is to train explicitly on the interaction-vs-noninteraction task (not just “model quality”): PPIscreenML learns to separate AF2 models of real interacting heterodimers from AF2 models of “compelling decoys” that look structurally plausible.
3. Decoy generation is a central innovation: for each true PDB heterodimer, each partner is replaced by its closest structural analog (by TM-score) from a nonredundant set, then aligned into the original complex geometry. The resulting noninteracting pairs mimic the geometry of true interfaces and are hard to dismiss by trivial heuristics.
4. Dataset scale and realism: 1481 nonredundant heterodimeric PDB complexes (<=30% sequence identity; excluding homodimers and antibody/antigen). Five AF2-Multimer v2.3 predictions per active and per decoy. Training includes only AF2 active models with DockQ >= 0.23 (to avoid learning from mis-docked “actives”), but the held-out test set keeps mis-docked actives to better reflect prospective screening.
5. Feature design blends AF2 confidence with physics-inspired energetics: 57 features total, spanning AF2 confidence metrics (pLDDT/pTM/PAE-derived terms), structural “counting” descriptors of interfaces, and Rosetta energy terms computed on the predicted complexes.
6. Model selection: several standard classifiers perform similarly, with gradient-boosted trees (XGBoost) best overall. On a completely held-out test set, the full feature model reaches ROC-AUC ~0.892 when scoring each candidate by the best of 5 AF2 models (a practical screening strategy).
7. A compact 7-feature version retains nearly the same performance (test ROC-AUC ~0.884), suggesting much of the signal is captured by a small set of interpretable interface cues: an interface-PAE statistic, interface charge count, and multiple Rosetta interfacial terms (LJ attractive/repulsive, solvation, electrostatics) plus a beta-sheet-related interface fraction.
8. Benchmark vs commonly used AF2-derived scores: on the same held-out test set, PPIscreenML outperforms iPTM and pDockQ for classifying interacting vs noninteracting pairs (AUC ~0.884 vs ~0.843 for iPTM and ~0.710 for pDockQ), highlighting the benefit of training specifically for screening rather than for structure-quality assessment.
9. Generalization test in a difficult “structurally conserved but selective” regime: across the TNF superfamily (18 ligands x 28 receptors = 504 pairs; only 36 known binders), AF2 models many pairs in similar poses regardless of whether they truly bind. PPIscreenML nonetheless recapitulates specificity well (ROC-AUC ~0.93): the top-scoring receptor is a true interactor for 14/18 ligands, and a true interactor is within the top 2 for 17/18. This holds even though no TNFSF complexes were in the training set and training was restricted to dimers.
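To make the evaluation in points 6 and 9 concrete, here is a minimal Python sketch of three steps: scoring each pair by the best of its 5 AF2 models, computing ROC-AUC, and counting ligands whose top-k receptors include a known binder. This is not the authors' code. The function names (best_of_models, roc_auc, top_k_hits) are hypothetical, and the scores and binder labels are made-up toy values, not results from the paper.

```python
from collections import defaultdict


def best_of_models(model_scores):
    """Collapse per-model classifier scores to one score per pair (the maximum)."""
    return {pair: max(scores) for pair, scores in model_scores.items()}


def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Quadratic in the number of pairs, which is
    fine for a toy example.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def top_k_hits(pair_scores, known_binders, k=1):
    """Count ligands whose k best-scoring receptors include a known binder."""
    by_ligand = defaultdict(list)
    for (ligand, receptor), score in pair_scores.items():
        by_ligand[ligand].append((score, receptor))
    hits = 0
    for ligand, scored in by_ligand.items():
        ranked = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
        if any((ligand, receptor) in known_binders for _, receptor in ranked):
            hits += 1
    return hits


# Toy data (assumed, for illustration only): 5 classifier scores per
# ligand/receptor pair, one score per AF2 model.
model_scores = {
    ("L1", "R1"): [0.91, 0.40, 0.75, 0.62, 0.88],
    ("L1", "R2"): [0.20, 0.35, 0.10, 0.30, 0.25],
    ("L2", "R1"): [0.55, 0.60, 0.48, 0.52, 0.58],
    ("L2", "R2"): [0.45, 0.50, 0.42, 0.57, 0.49],
}
known_binders = {("L1", "R1"), ("L2", "R2")}

pair_scores = best_of_models(model_scores)
pairs = list(pair_scores)
labels = [1 if pair in known_binders else 0 for pair in pairs]

auc = roc_auc([pair_scores[pair] for pair in pairs], labels)
top1 = top_k_hits(pair_scores, known_binders, k=1)
top2 = top_k_hits(pair_scores, known_binders, k=2)
print(auc, top1, top2)
```

Taking the maximum over the 5 models means a single confident model is enough to flag a pair. The top-k count is a per-ligand check that complements the global ROC-AUC.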
💻Code:
github.com/victoria-mischley…
📜Paper:
doi.org/10.7554/eLife.98179
#ProteinProteinInteractions #AlphaFold #AlphaFoldMultimer #ComputationalBiology #StructuralBioinformatics #MachineLearning #Rosetta #Interactome #SystemsBiology #DrugDiscovery