Amortized Sampling with Transferable Normalizing Flows

1. Introducing PROSE, a 280 million parameter all-atom transferable normalizing flow model. This model is trained on a large corpus of peptide molecular dynamics trajectories and demonstrates unprecedented abilities to transfer to previously unseen systems of different amino acids, sizes, and temperatures, outperforming traditional molecular dynamics (MD) methods in terms of computational efficiency and sampling performance.

2. The core innovation of PROSE lies in its ability to draw zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving transferability across sequence length while retaining the efficient likelihood evaluation of normalizing flows. This is a significant advancement as it allows for the generation of high-quality samples without the need for retraining or fine-tuning for each new system, thus addressing a major limitation of conventional sampling methods.

3. The study introduces a novel dataset of molecular dynamics trajectories for peptide systems between 2 and 8 residues, consisting of 21,700 peptide sequences simulated for 200 ns each. This extensive dataset provides a rich resource for training and evaluating the model, enabling it to learn the complex distributions of molecular conformations and generalize to unseen systems.

4. PROSE builds on the TarFlow architecture, incorporating several architectural modifications to enhance its performance. These include adaptive system conditioning through adaptive layer normalization and SwiGLU-based transition blocks, as well as chemistry-aware sequence permutations that promote effective molecular modeling by updating the backbone atoms before the sidechains.

5. The efficacy of PROSE as a proposal distribution for different Monte Carlo algorithms is demonstrated through extensive empirical evaluation. The study finds that a simple importance sampling-based fine-tuning procedure can achieve superior performance to established methods such as sequential Monte Carlo on unseen tetrapeptides, highlighting the potential of PROSE for accurate and efficient sampling in various applications.

6. The scalability and transferability of PROSE are further confirmed by its ability to sample from the equilibrium distribution on previously unseen peptide systems of length up to 8 residues, surpassing the continuous normalizing flow-based transferable Boltzmann generator while generating proposals significantly faster. This opens up new possibilities for accelerated sampling in computational chemistry and statistical inference.

7. The authors open-source the PROSE codebase, model weights, and training dataset, facilitating further research and development in the field of amortized sampling methods. This open-source approach encourages collaboration and innovation, allowing other researchers to build upon the work and explore new applications and improvements.

💻Code: github.com/transferable-samp…
📜Paper: arxiv.org/abs/2508.18175v1

#AmortizedSampling #TransferableNormalizingFlows #PROSE #MolecularDynamics #ComputationalChemistry #StatisticalInference
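
To make the "proposal distribution" idea from points 2 and 5 concrete, here is a minimal sketch of self-normalized importance sampling in NumPy. It is not the PROSE code: a 1D Gaussian stands in for the flow (all it shares with a normalizing flow is that it can be sampled and has an exact log-density), and a toy double-well energy stands in for the molecular potential. The function names and every parameter value are illustrative assumptions.

```python
import numpy as np

def snis_weights(log_target_unnorm, log_proposal):
    """Self-normalized importance weights computed from log-densities."""
    log_w = log_target_unnorm - log_proposal
    log_w = log_w - np.max(log_w)  # shift for numerical stability
    w = np.exp(log_w)
    return w / np.sum(w)

def effective_sample_size(weights):
    """Kish effective sample size of normalized weights."""
    return 1.0 / np.sum(weights ** 2)

# Assumption: a Gaussian proposal stands in for the flow model.
rng = np.random.default_rng(0)
n = 50_000
sigma = 1.5
x = rng.normal(0.0, sigma, size=n)
log_q = -0.5 * (x / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

# Assumption: toy double-well energy in units of kT, so the
# unnormalized Boltzmann log-density is just -energy.
energy = (x ** 2 - 1.0) ** 2
log_p_unnorm = -energy

w = snis_weights(log_p_unnorm, log_q)
mean_x2 = np.sum(w * x ** 2)      # reweighted estimate of E[x^2]
ess = effective_sample_size(w)    # how many "effective" samples remain

print(f"E[x^2] estimate: {mean_x2:.3f}, ESS: {ess:.0f} of {n}")
```

The better the proposal matches the target, the closer the effective sample size gets to the number of proposals drawn, which is why a proposal that transfers to unseen systems without retraining matters here.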