De novo design of DNA origami with a generative diffusion model
1 Generative SNUPI introduces diffusion-model-based inverse design for DNA origami: given a user-defined line-based target geometry, it generates base-pair-level 3D structures that are physically plausible, then automatically produces scaffold routing and staple sequences for experimental fabrication.
2 A key bottleneck in generative DNA origami, the lack of large standardized structural datasets, is addressed by training on simulated equilibrium conformations: 450 wireframe 2HB designs (216 2D, 234 3D) whose base-pair coordinates were generated with the SNUPI multiscale model.
3 The generative core is a denoising diffusion probabilistic model operating on base-pair coordinates as a point-cloud-like representation, implemented with a scalable graph Transformer using random graph construction and SE(3)-aware geometric handling to avoid alignment during training.
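To make the diffusion step in point 3 concrete, here is a minimal sketch of the standard DDPM forward (noising) process applied to a base-pair point cloud. The linear beta schedule, the step count, and the random stand-in coordinates are assumptions for illustration; the thread does not state the schedule used in Generative SNUPI, and the graph Transformer that would learn to predict the noise is not shown.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Hypothetical stand-in for base-pair coordinates: 200 points in 3D.
x0 = rng.standard_normal((200, 3))
x0 -= x0.mean(axis=0)  # centering removes the translation degree of freedom

# Noised coordinates at step 500 plus the noise a denoiser would be trained to predict.
xt, eps = forward_noise(x0, 500, alpha_bar, rng)
```

Training would then minimize the error between `eps` and the network's prediction from `(xt, t)`; sampling runs the learned denoiser in reverse from pure noise.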
4 To follow a target shape, the model uses conditional guidance based on optimal transport: classifier-style gradients of the Wasserstein distance (WD) bias the diffusion sampling so that generated structures converge toward the provided geometry, improving shape fidelity and routing success.
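The guidance idea in point 4 can be sketched as a gradient step on a transport cost between the generated points and points sampled from the target lines. This is not the authors' implementation: it swaps in an entropic (Sinkhorn) approximation of optimal transport, holds the transport plan fixed when differentiating, and omits the denoiser entirely. All function names, the regularization strength, and the step size are assumptions.

```python
import numpy as np

def sinkhorn_plan(x, y, reg=0.05, n_iter=200):
    """Entropic OT plan between uniform point clouds x and y (squared-distance cost)."""
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-cost / (reg * cost.max() + 1e-12))
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    return u[:, None] * kernel * v[None, :], cost

def transport_cost_and_grad(x, y):
    """Transport cost and its gradient w.r.t. x with the plan held fixed."""
    plan, cost = sinkhorn_plan(x, y)
    value = float((plan * cost).sum())
    grad = 2.0 * (plan.sum(axis=1, keepdims=True) * x - plan @ y)
    return value, grad

rng = np.random.default_rng(0)
# Hypothetical target: 64 points sampled along one 10 nm line segment.
target = np.stack([np.linspace(0.0, 10.0, 64), np.zeros(64), np.zeros(64)], axis=1)
points = 5.0 * rng.standard_normal((64, 3))

initial_cost, _ = transport_cost_and_grad(points, target)
for _ in range(50):
    _, grad = transport_cost_and_grad(points, target)
    points = points - 0.1 * len(points) * grad  # pull points toward their transport targets
final_cost, _ = transport_cost_and_grad(points, target)
```

In a guided sampler, a gradient like `grad` would be added to each reverse diffusion update rather than applied on its own as it is here.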
5 Across 100 diverse conditional generations (hundreds to ~15,000 base pairs), the WD to the target drops from widely varying initial values (192.69–2178.54 nm) to a low final average of 2.21 ± 1.32 nm, indicating consistent convergence to the intended geometry across sizes and complexities.
6 The pipeline goes beyond shape generation by integrating a deterministic routing program: generated geometries are converted into loop representations, spanning trees, scaffold routes, and staple sets (20–60 nt), with bond lengths regularized to 0.34 ± 0.05 nm, and are then exported to atomic models via CNDO → oxDNA → PDB post-processing.
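One way to picture the bond-length regularization in point 6 is as a clamp on the spacing between consecutive points of a chain. The sketch below assumes the "bonds" are distances between successive base pairs along a single path and that directions are preserved while lengths are clipped to 0.34 ± 0.05 nm; the actual procedure in the routing program may differ, and `regularize_bond_lengths` is a hypothetical name.

```python
import numpy as np

def regularize_bond_lengths(coords, target=0.34, tol=0.05):
    """Clip each consecutive step length to [target - tol, target + tol] nm.

    Step directions are kept; the chain is rebuilt from its first point.
    """
    steps = np.diff(coords, axis=0)
    lengths = np.linalg.norm(steps, axis=1, keepdims=True)
    clipped = np.clip(lengths, target - tol, target + tol)
    steps = steps * clipped / np.maximum(lengths, 1e-12)
    return np.vstack([coords[:1], coords[0] + np.cumsum(steps, axis=0)])

rng = np.random.default_rng(1)
# Hypothetical noisy chain of 50 base-pair positions (nm).
chain = np.cumsum(rng.normal(scale=0.3, size=(50, 3)), axis=0)
fixed = regularize_bond_lengths(chain)
bond = np.linalg.norm(np.diff(fixed, axis=0), axis=1)
```

Rebuilding the chain from the first point keeps the pass exact in one sweep, at the cost of letting the far end drift from its original position.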
7 Generative SNUPI also embeds fast, in-workflow physics evaluation using SNUPI-based simulation to predict equilibrium shapes and flexibility (RMSD, RMSF) without heavy molecular dynamics; for 100 designs, many cluster around RMSD 2.49 ± 1.29 nm and average RMSF 1.72 ± 0.15 nm, enabling pre-experimental screening.
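For readers unfamiliar with the two metrics in point 7, here is a minimal sketch of how RMSD and RMSF are commonly computed from coordinates. It assumes an ensemble of frames and Kabsch alignment; SNUPI may derive these quantities differently (for example, directly from its mechanical model), so treat the function names and the ensemble-based approach as illustrative.

```python
import numpy as np

def kabsch_align(mobile, reference):
    """Rigidly superimpose `mobile` onto `reference` (both N x 3)."""
    mob_mean, ref_mean = mobile.mean(axis=0), reference.mean(axis=0)
    mob_c, ref_c = mobile - mob_mean, reference - ref_mean
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0  # avoid reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob_c @ rot.T + ref_mean

def rmsd(a, b):
    """Root-mean-square deviation between two matched point sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def rmsf(frames):
    """Per-point fluctuation about the mean position; frames is T x N x 3."""
    mean_pos = frames.mean(axis=0)
    return np.sqrt(((frames - mean_pos) ** 2).sum(axis=2).mean(axis=0))

rng = np.random.default_rng(2)
reference = rng.standard_normal((100, 3))

# A rotated and shifted copy should align back onto the reference.
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
moved = reference @ rot_z.T + np.array([3.0, -1.0, 2.0])
aligned = kabsch_align(moved, reference)

# Hypothetical ensemble: 20 noisy copies of the reference.
frames = np.stack([reference + 0.1 * rng.standard_normal(reference.shape) for _ in range(20)])
per_point_rmsf = rmsf(frames)
```

Points with locally high `per_point_rmsf` are the kind of flexible regions the screening step is meant to flag before fabrication.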
8 Experimental validation shows the simulation-guided design loop is actionable: a “Face 1” dog design predicted to have locally high RMSF folds with high monomer yield yet shows AFM distortion; adding edges to stiffen flexible regions (“Face 2”) improves AFM agreement and reduces RMSD (4.07 ± 0.48 nm to 3.45 ± 0.35 nm).
9 The framework supports functional free-form mechanics and assembly: auxetic metastructures (rotating triangle, re-entrant) are designed and experimentally transformed open→closed using junction gaps plus site-specific connectors, achieving mean enclosed-area reductions of 34.9% and 47.3%; modular dog face/body components with matched curved interfaces assemble into dimers with >65% yield across combinations.
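The enclosed-area reductions in point 9 are percentage changes between open and closed states. As a toy illustration of that arithmetic, the sketch below computes polygon areas with the shoelace formula and reports the relative reduction. The vertex coordinates are made up, and how the authors extracted areas from their images is not stated in this thread.

```python
def polygon_area(vertices):
    """Area of a simple polygon from ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def area_reduction_percent(open_vertices, closed_vertices):
    """Percent decrease in enclosed area going from the open to the closed state."""
    a_open = polygon_area(open_vertices)
    a_closed = polygon_area(closed_vertices)
    return 100.0 * (a_open - a_closed) / a_open

# Toy shapes only: a 2 x 2 square that closes to a 2 x 1 rectangle.
open_state = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
closed_state = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
reduction = area_reduction_percent(open_state, closed_state)
```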
💻Code:
github.com/SSDL-SNU/Generati…
📜Paper:
doi.org/10.1038/s41467-026-7…
#DNANanotechnology #DNAOrigami #GenerativeAI #DiffusionModels #InverseDesign #ComputationalBiology #Biophysics #Nanorobotics #StructuralBiology #MachineLearning