De novo peptides designed by ML and integer optimization that sit at the interface of biomolecular condensates
Inside cells, many biochemical reactions take place in membraneless droplets called biomolecular condensates. Their interface (the thin layer between the dense droplet and the dilute phase) is increasingly recognized as a hotspot: it can accelerate the formation of disease-associated amyloid fibrils (hnRNPA1, FUS, α-synuclein), promote redox reactions, and modulate condensate size. Designing short peptides that selectively localize there, without disrupting the bulk, would give us a real handle on these systems. The problem is that classical surfactant intuition does not transfer: both phases are roughly 70% water, and the design space for a 30-residue peptide is 20^30 (about 10^39) sequences.
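To make the size of that design space concrete, here is a quick back-of-the-envelope calculation. The screening rate of one billion evaluations per second is a hypothetical number chosen only for illustration; it does not come from the paper.

```python
# Number of distinct 30-residue peptides over the 20 standard amino acids.
n_sequences = 20 ** 30

# Hypothetical brute-force screening rate (an assumption for illustration).
evals_per_second = 1e9
seconds_per_year = 365.25 * 24 * 3600

# Time needed to enumerate every sequence once at that rate.
years = n_sequences / evals_per_second / seconds_per_year

print(f"{n_sequences:.3e} sequences")
print(f"{years:.1e} years to enumerate at {evals_per_second:.0e} evaluations/s")
```

Even at that optimistic rate, exhaustive enumeration is out of reach, which is why the search has to be guided by a model.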
Timo Schneider and coauthors build a pipeline that tackles this head-on. High-throughput coarse-grained MD (Mpipi force field) with adaptive biasing force quantifies, for each candidate, the free energy of partitioning between the dilute phase, the interface, and the dense phase, plus the second virial coefficient B2, which captures self-association. A multi-output neural network is then trained to predict these quantities from 44 engineered sequence features (composition, charge and hydropathy decoration, aromatic patterning).
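As a rough sketch of what sequence features of this kind look like, the snippet below computes two simple ones from a one-hot encoding. The two features, the charge assignment (K and R as +1, D and E as -1, histidine neutral), and the example peptide are illustrative assumptions, not the paper's actual set of 44 features.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot(seq):
    """Encode a sequence as a list of 20-element 0/1 rows, one row per residue."""
    rows = []
    for aa in seq:
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1
        rows.append(row)
    return rows

# Per-amino-acid weights (illustrative assumptions, see lead-in).
CHARGE = [{"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}.get(aa, 0.0) for aa in AMINO_ACIDS]
AROMATIC = [1.0 if aa in "FWY" else 0.0 for aa in AMINO_ACIDS]

def linear_feature(x, weights):
    """Position-averaged weighted sum: a linear function of the one-hot entries."""
    return sum(w * v for row in x for w, v in zip(weights, row)) / len(x)

# Hypothetical peptide with a lysine-rich end and an aromatic-rich end.
x = one_hot("KKKKGSYFWY")
net_charge_per_residue = linear_feature(x, CHARGE)
aromatic_fraction = linear_feature(x, AROMATIC)
print(net_charge_per_residue, aromatic_fraction)
```

Because each feature is a fixed weighted sum of the 0/1 entries, it stays linear when those entries become binary decision variables in an optimization model.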
The key step: because all features are linear in the one-hot sequence and the network uses ReLU activations, the inverse design problem can be reformulated as a mixed-integer linear program (MILP, a classical optimization framework in which solvers can certify global optimality), embedded via OMLT into Pyomo, and solved with Gurobi. The AGGRESCAN aggregation-propensity score enters as a hard constraint. The output is a true Pareto front, with optimality guarantees that heuristic searches such as genetic algorithms cannot provide.
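The reason ReLU networks fit into a MILP is that y = max(0, z) can be written with linear constraints plus one binary variable. A minimal sketch of the standard big-M encoding follows; whether the paper's OMLT setup uses exactly this variant is an assumption, and the bounds and grid are made up for the check. Given bounds lower <= z <= upper with lower < 0 < upper, the constraints are y >= 0, y >= z, y <= z - lower*(1 - a), and y <= upper*a, with a in {0, 1}.

```python
def relu_bigm_feasible(z, y, a, lower, upper, tol=1e-9):
    """Check the four big-M constraints linking pre-activation z, output y, and binary a."""
    assert lower < 0 < upper and a in (0, 1)
    return (
        y >= -tol                             # y >= 0
        and y >= z - tol                      # y >= z
        and y <= z - lower * (1 - a) + tol    # a = 1 forces y <= z
        and y <= upper * a + tol              # a = 0 forces y <= 0
    )

def feasible_outputs(z, lower, upper, grid):
    """All grid values of y that satisfy the constraints for some choice of a."""
    return sorted(
        {y for y in grid for a in (0, 1) if relu_bigm_feasible(z, y, a, lower, upper)}
    )

# Illustrative bounds and a half-step grid covering [lower, upper].
lower, upper = -3.0, 3.0
grid = [k / 2 for k in range(-6, 7)]

# For every z on the grid, the only feasible output should be max(0, z).
encoding_is_exact = all(
    feasible_outputs(z, lower, upper, grid) == [max(0.0, z)] for z in grid
)
print(encoding_is_exact)
```

A solver such as Gurobi branches on the binary variables a, one per ReLU unit, which is how it can search the network's input space exhaustively in principle and certify the optimum.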
Applied to three condensate targets (hnRNPA1-LCD, LAF-1-RGG, DDX4N), the pipeline converges on surfactant-like architectures: an aromatic-rich tail anchoring into the condensate via π-π and cation-π interactions, and a second tail excluded from the dense phase whose composition tracks the scaffold's net charge (polylysine for positively charged scaffolds, valine-rich for near-neutral DDX4N). Confocal microscopy confirms interfacial rims for all three designs, and the peptides shrink condensate size distributions while leaving bulk viscosity (FLIM) untouched.
For applied R&D, this matters beyond condensate biology. Combining trained neural networks with integer optimization is a general recipe for biomolecular inverse design where local optima are a real risk: peptide therapeutics modulating aggregation-prone condensates in neurodegeneration, and engineered sequences for compartmentalized biocatalysis in biopharma.
Paper: Schneider et al., Nature Communications (2026), CC BY 4.0 | doi.org/10.1038/s41467-026-7…