Gumbel-Softmax Score and Flow Matching for Discrete Biological Sequence Generation

1. This paper introduces a novel generative framework called Gumbel-Softmax Flow and Score Matching for generating high-quality discrete biological sequences such as DNA, peptides, and proteins. The approach leverages a Gumbel-Softmax interpolant on the simplex to enable smooth transitions from noisy to clean distributions.
2. The key innovation is the use of a temperature-controlled Gumbel-Softmax distribution to define a velocity field that transports distributions from a uniform prior to a concentrated one-hot distribution over time. This avoids discretization errors and improves scalability to higher dimensions compared to previous methods.
3. The authors propose two main components: Gumbel-Softmax Flow Matching, which learns to predict the velocity field, and Gumbel-Softmax Score Matching, which estimates the gradient of the probability density. Both methods enable high-quality and diverse sequence generation.
4. A significant contribution is the introduction of Straight-Through Guided Flows (STGFlow), a training-free guidance method that uses pre-trained classifiers to steer the flow towards optimal sequences without requiring additional training of time-dependent classifiers.
5. The framework demonstrates competitive performance in conditional DNA promoter design, target-binding peptide design for rare disease treatment, and de novo protein sequence design, showcasing its potential for various biological applications.
6. The method effectively addresses limitations of previous discrete flow matching techniques, such as deterministic paths and lack of controllability at inference time, by introducing stochasticity and modular guidance.
7. The approach is scalable and can handle higher-dimensional simplex spaces, making it suitable for complex biological sequence generation tasks. It also provides a robust solution for controllable de novo sequence design.
📜Paper: openreview.net/forum?id=vx1u… #BiologicalSequenceGeneration #GumbelSoftmax #FlowMatching #ScoreMatching #DiscreteGeneration #ProteinDesign #PeptideDesign
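To make the temperature-controlled interpolant in point 2 more concrete, here is a minimal NumPy sketch of the general idea: a one-hot target perturbed by Gumbel noise and pushed through a softmax whose temperature shrinks as t goes from 0 to 1. The function name, the exponential schedule `tau_max * exp(-decay * t)`, the `noise_scale` parameter, and all default values are assumptions made for illustration; they are not taken from the paper, whose exact formulation may differ.

```python
import numpy as np


def gumbel_softmax_interpolant(x1, t, num_classes, tau_max=10.0, decay=5.0,
                               noise_scale=1.0, rng=None):
    """Illustrative point on the simplex between a near-uniform state and one-hot.

    x1          : integer token ids, shape (L,)
    t           : time in [0, 1]; 0 is the noisy end, 1 is the clean end
    tau_max     : assumed starting temperature (hypothetical default)
    decay       : assumed exponential decay rate of the temperature (hypothetical)
    noise_scale : assumed multiplier on the Gumbel noise (hypothetical)
    """
    rng = np.random.default_rng() if rng is None else rng
    one_hot = np.eye(num_classes)[np.asarray(x1)]          # shape (L, K)

    # Assumed schedule: high temperature early, low temperature late.
    tau = tau_max * np.exp(-decay * t)

    # Standard Gumbel(0, 1) samples via the inverse-CDF trick.
    u = rng.uniform(low=1e-9, high=1.0, size=one_hot.shape)
    gumbel = -np.log(-np.log(u))

    # Temperature-scaled softmax over the class axis, shifted for stability.
    logits = (one_hot + noise_scale * gumbel) / tau
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)


# Usage: a length-5 DNA-like sequence over a 4-letter alphabet.
tokens = np.array([0, 3, 1, 2, 0])
early = gumbel_softmax_interpolant(tokens, t=0.0, num_classes=4)
late = gumbel_softmax_interpolant(tokens, t=1.0, num_classes=4)
```

Under these assumed settings, rows of `early` stay close to the centre of the simplex while rows of `late` are sharply peaked. With `noise_scale=1.0` the peak at t=1 can land on a class other than the target, which is one way to picture the stochasticity mentioned in point 6; setting `noise_scale=0.0` gives a deterministic path toward the target one-hot vector.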