Fast sampling of protein conformational dynamics
@ScienceAdvances
1. Sauer et al. show that the key collective variables (CVs) needed to drive enhanced sampling of protein conformational transitions are encoded in anharmonic low-frequency vibrations, and these CVs can be extracted from short unbiased MD without any prior knowledge of the transition.
2. Core idea: use FRESEAN (frequency-selective anharmonic mode analysis) at (near) zero frequency to isolate collective motions with minimal restoring forces (i.e., “paths of least resistance” for conformational change), avoiding the limitations of harmonic/quasiharmonic normal modes in the low-frequency, diffusive regime.
3. Practical pipeline: run 20 ns of unbiased all-atom MD, align the trajectories, coarse-grain to a 2-bead-per-residue representation (1 bead for Gly), compute velocity time-correlation matrices, Fourier transform them to the frequency domain, then take the eigenvectors at zero frequency. Modes 1–6 correspond to translation/rotation and are discarded; modes 7 and above capture internal anharmonic low-frequency vibrations.
4. Reproducibility is a central result: across 5 independent 20 ns replicas per protein, the low-frequency modes (especially the 2D subspace spanned by modes 7–8) are consistently recovered, unlike PCA/quasiharmonic modes whose replica-to-replica agreement remains poor even with much longer trajectories.
5. Enhanced sampling step: use modes 7 and 8 as CVs in well-tempered metadynamics (100 ns per run; reported as <24 hours on a single GPU). Across 5 proteins × 5 replicas, 22/25 runs (88%) sample known “closed↔open” transitions within 100 ns; extending to 160 ns yields full sampling for all replicas.
6. Benchmark set spans diverse challenges: HEWL (disulfide-stabilized), HIV-1 protease (homodimer), MCL-1 (allosteric/druggable dynamics), ribose-binding protein (multi-domain hinge motion), and GDP-bound KRAS (switch-region dynamics). The same FRESEAN-to-metadynamics protocol is applied across all systems.
7. Free-energy surfaces (FESs) can be obtained quickly and with controlled statistical error by running 20 parallel metadynamics replicas (20 × 100 ns) with the same FRESEAN CVs: single-run uncertainties are typically < ±10 kJ/mol, and averaging reduces the standard error to < ±3 kJ/mol, enabling reproducible thermodynamic ensembles rather than just qualitative transitions.
8. Comparison to “hand-crafted” geometric CVs from prior literature is informative: biasing along FRESEAN modes often follows lower-free-energy transition routes and tends to keep sampling within the native folded ensemble, whereas geometric CVs can push systems into partially unfolded high-entropy states (most notably KRAS when biased by residue–residue distances).
9. The authors quantify cross-CV reweighting fidelity using Shannon entropy and Bhattacharyya coefficients: on average, ensembles generated by biasing along low-frequency vibrational CVs preserve at least as much (often more) information when reweighted into geometric-variable space than the reverse, supporting the claim that these vibrations are broadly suitable, system-agnostic CVs.
10. Implication for computational biology/ML: the method enables high-throughput generation of conformational ensembles and FESs (including mutants/conditions), helping address the dataset bottleneck for next-generation sequence→structure→dynamics models beyond single static folds or single thermodynamic states.
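To make point 3 concrete, here is a minimal NumPy sketch of the zero-frequency step: build the velocity time-correlation matrix, sum it over lags (the discrete Fourier integral at ω = 0), and diagonalize. It is not the authors' implementation (see the linked repository for that). The function name `zero_frequency_modes`, the `max_lag` truncation, the optional mass-weighting, and the synthetic 3-coordinate test signal are all assumptions made for illustration.

```python
import numpy as np


def zero_frequency_modes(velocities, dt, max_lag, masses=None):
    """Eigen-decompose a zero-frequency velocity correlation matrix.

    velocities: array of shape (n_frames, n_dof)
    dt:         time between frames
    max_lag:    largest lag (in frames) included in the sum (assumed cutoff)
    masses:     optional per-coordinate masses for mass-weighting (assumption)
    """
    v = np.asarray(velocities, dtype=float)
    if masses is not None:
        v = v * np.sqrt(np.asarray(masses, dtype=float))
    v = v - v.mean(axis=0)
    n_frames = v.shape[0]

    # Lag-0 term of C_ij(tau) = <v_i(t) v_j(t + tau)>.
    spectrum = v.T @ v / n_frames
    for lag in range(1, max_lag + 1):
        c = v[:-lag].T @ v[lag:] / (n_frames - lag)
        spectrum += c + c.T  # C(-tau) is the transpose of C(tau)
    spectrum *= dt  # discrete Fourier integral evaluated at omega = 0

    eigvals, eigvecs = np.linalg.eigh(spectrum)  # spectrum is symmetric
    order = np.argsort(eigvals)[::-1]  # largest zero-frequency weight first
    return eigvals[order], eigvecs[:, order]


# Synthetic demo (hypothetical data, not a protein): one direction with a
# persistent, positively correlated velocity plus two purely oscillatory ones.
rng = np.random.default_rng(0)
n_frames = 20000
t = np.arange(n_frames)

slow = np.zeros(n_frames)
for i in range(1, n_frames):
    slow[i] = 0.9 * slow[i - 1] + rng.normal()  # AR(1) velocity

u = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # planted slow direction
w1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
w2 = np.array([0.0, 0.0, 1.0])
velocities = (
    np.outer(slow, u)
    + np.outer(np.cos(2 * np.pi * t / 20), w1)
    + np.outer(np.cos(2 * np.pi * t / 25), w2)
)

eigvals, modes = zero_frequency_modes(velocities, dt=1.0, max_lag=100)
overlap = abs(modes[:, 0] @ u)
print(f"overlap of top zero-frequency mode with planted direction: {overlap:.3f}")
```

In the toy data, the oscillatory coordinates contribute little at zero frequency because their velocity autocorrelation nearly cancels when summed over whole periods, while the persistent coordinate dominates. For a real protein trajectory, the thread's description implies the first six modes would be translation/rotation and dropped, leaving modes 7 and 8 as candidate CVs.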
💻Code:
github.com/HeydenLabASU-coll…
📜Paper:
doi.org/10.1126/sciadv.aea46…
#MolecularDynamics #EnhancedSampling #Metadynamics #ProteinDynamics #FreeEnergy #ComputationalBiophysics #CollectiveVariables #FRESEAN #GROMACS #PLUMED