Modeling Atomic Conformational Ensembles of Proteins via Test-Time Supervision of Boltz-2 on Cryo-EM Density Maps
1. The paper proposes a unified alternative to the standard two-stage pipeline (build atomic models from cryo-EM maps, then train ensemble predictors): it fine-tunes a pre-trained structure model (Boltz-2) directly on raw cryo-EM density map ensembles, so supervision happens in map space rather than requiring ground-truth atomic ensembles.
2. The resulting method, CryoSampler, targets “atomic ensemble model building”: given an ensemble of heterogeneous cryo-EM volumes from a single experiment (multiple conformational states), it outputs a matching ensemble of atomistic conformations with strong map–model agreement and chemically plausible geometry.
3. Key design choice: Boltz-2 serves as a frozen static trunk that produces a reference structure X_ref, and CryoSampler learns deformations (per-atom coordinate offsets) on top of that reference to explain the variability across the cryo-EM density maps.
4. Training stage 1 is a 3D spatial VAE that encodes each cryo-EM map into a latent grid and decodes it into a 3D feature grid. A lightweight per-atom MLP queries this grid at the Boltz-2 reference coordinates to predict atomic offsets, and the resulting atomic model is differentiably rendered back into a density map via a cryo-EM forward model.
5. Supervision is end-to-end through a volumetric reconstruction objective: the main signal is 1 − NCC (one minus the normalized cross-correlation) between predicted and observed maps, i.e., map–map agreement. This avoids needing experimentally curated atomic conformations for each heterogeneous cryo-EM state.
6. To prevent “fitting the map at all costs,” CryoSampler adds structural regularization: differentiable MolProbity-style terms (clashes, Ramachandran, rotamers) plus a backbone soft rigid-body constraint that preserves local secondary-structure geometry relative to the Boltz-2 reference.
7. Training stage 2 freezes the VAE and learns a latent diffusion model using flow matching on the VAE latent codes; this provides a generative prior over the learned latent space, enabling sampling of multiple conformations rather than only reconstructing given maps.
8. Model-building results on four systems (TRPV3, integrin αVβ8, neurokinin-1 GPCR, human P-glycoprotein) show higher map–model fit than baselines (Boltz-2 alone, E2GMM, ModelAngelo, CryoBoltz where feasible), while maintaining competitive stereochemistry; the paper notes an explicit tradeoff where removing structural losses increases correlation but yields severely invalid geometry.
9. A practical systems point: some inference-time cryo-EM-guided baselines can fail on large complexes (e.g., CryoBoltz runs out of memory on TRPV3 at 2,556 residues), while CryoSampler’s training-time adaptation is presented as a route to accurate fitting with controlled stereochemistry.
10. Beyond per-experiment overfitting, the paper reports preliminary in-domain generalization within TRP channels: after training on TRPV3 map states, CryoSampler samples an ensemble for TRPV5 without using any TRPV5 cryo-EM data at inference, and matches held-out TRPV5 maps better than Boltz-2 sampling or normal mode analysis (evaluated via ensemble-level precision/recall and Wasserstein distance derived from CC_volume).
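To make points 3 to 5 concrete, here is a minimal NumPy sketch of the map-space objective: reference coordinates plus per-atom offsets are rendered to a density grid and scored with 1 − NCC. Everything here is an illustrative assumption rather than the paper's implementation: the function names `render_density` and `ncc_loss` are hypothetical, the isotropic Gaussian blobs are only a toy stand-in for a real cryo-EM forward model, and the offsets are set by hand instead of being predicted by a network.

```python
import numpy as np


def render_density(coords, grid_size=16, voxel=1.0, sigma=1.0):
    """Toy forward model (assumption): sum one isotropic Gaussian per atom.

    coords has shape (N, 3); the result has shape (grid_size,) * 3.
    """
    coords = np.asarray(coords, dtype=np.float64)
    axis = np.arange(grid_size) * voxel
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1)              # (G, G, G, 3)
    # Squared distance from every voxel to every atom: (G, G, G, N)
    d2 = ((grid[..., None, :] - coords) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2)).sum(axis=-1)


def ncc_loss(pred, obs, eps=1e-8):
    """Return 1 - NCC between two density maps of identical shape."""
    pred = np.asarray(pred, dtype=np.float64)
    obs = np.asarray(obs, dtype=np.float64)
    if pred.shape != obs.shape:
        raise ValueError("maps must have the same shape")
    p = pred.ravel() - pred.mean()                      # zero-mean maps
    o = obs.ravel() - obs.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(o) + eps
    return 1.0 - float(np.dot(p, o) / denom)


# Hypothetical three-atom "reference structure" and hand-set offsets.
x_ref = np.array([[5.0, 5.0, 5.0], [8.0, 8.0, 8.0], [10.0, 6.0, 7.0]])
offsets = np.array([[1.5, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 2.0]])

# Pretend the observed map comes from the deformed structure.
observed = render_density(x_ref + offsets)

loss_ref = ncc_loss(render_density(x_ref), observed)            # undeformed
loss_fit = ncc_loss(render_density(x_ref + offsets), observed)  # deformed
print(f"reference: {loss_ref:.4f}  deformed: {loss_fit:.4f}")
```

Because NCC subtracts the mean and normalizes by the norm, the loss ignores global scale and offset of the map values, which is one plausible reason to prefer it over a plain voxel-wise squared error. In an actual training loop the same two functions would need to be written in an autodiff framework so that gradients reach the offset predictor.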
📜Paper:
arxiv.org/abs/2605.09832
#CryoEM #ProteinDynamics #DiffusionModels #ProteinStructure #Boltz2 #GenerativeAI #ComputationalBiology #StructuralBiology #ModelBuilding #EnsembleModeling