Generative continuous time model reveals epistatic signatures in protein evolution
1. The paper introduces a continuous-time model of sequence evolution, simulated with the Gillespie algorithm and parameterized by a generative Potts model. The model can simulate realistic, family-specific evolutionary trajectories and can be compared directly with independent-site models.
2. The study finds that epistasis significantly slows evolution overall, yet does not change the average evolutionary rate at individual sites. It explains this by context-dependent rate heterogeneity: at some positions the rate varies with the sequence context, while other positions evolve essentially independently.
3. The authors show that epistasis causes a systematic underestimation of the evolutionary distance inferred between sequences. The bias is stronger for slowly evolving sequences and for highly context-dependent sites.
4. The model assigns each sequence its own rate of evolution, which is highly correlated with the sequence’s energy in the Potts model. Low-energy sequences (more probable and functional) tend to evolve more slowly, while high-energy sequences can mutate rapidly.
5. The study provides a method to rank positions based on how context-dependent their evolution is. It reveals that some positions are highly context-dependent, while others evolve independently, with the long-term average rates being similar across both groups.
6. The authors demonstrate that ignoring epistasis in phylogenetic inference leads to a systematic underestimation of evolutionary time. This bias could impact full-scale phylogenetic reconstruction by compressing branch lengths or distorting tree topology.
7. The core code for the evolutionary model is available at
github.com/PierreBarrat/Pott…, and scripts and data to reproduce the results are available at
github.com/PierreBarrat/Cont….
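To make points 1 and 4 concrete, here is a minimal Python sketch of a Gillespie simulation driven by a Potts energy. It is not the authors' code (see the repositories above for that). The function names, the toy parameters `h` and `J`, and in particular the Metropolis-style rate `min(1, exp(-dE))` are assumptions chosen for illustration; the paper may define its rates differently. The sketch only shows how a context-dependent, sequence-specific total rate arises.

```python
import numpy as np

def potts_energy(seq, h, J):
    """E(s) = -sum_i h[i, s_i] - sum_{i<j} J[i, j, s_i, s_j]."""
    L = len(seq)
    e = -sum(h[i, seq[i]] for i in range(L))
    for i in range(L):
        for j in range(i + 1, L):
            e -= J[i, j, seq[i], seq[j]]
    return e

def mutation_rates(seq, h, J):
    """Rates of all single-site changes (site i -> state b) in the current context.

    ASSUMPTION: Metropolis-form rate exp(-max(dE, 0)), which satisfies detailed
    balance with respect to P(s) ~ exp(-E(s)). Not necessarily the paper's choice.
    """
    L, q = h.shape
    e0 = potts_energy(seq, h, J)
    rates = np.zeros((L, q))
    for i in range(L):
        for b in range(q):
            if b == seq[i]:
                continue  # no self-transitions
            mutant = seq.copy()
            mutant[i] = b
            dE = potts_energy(mutant, h, J) - e0
            rates[i, b] = np.exp(-max(dE, 0.0))
    return rates

def gillespie(seq, h, J, t_max, rng):
    """Evolve a copy of seq for time t_max; return (final sequence, number of events)."""
    seq = seq.copy()
    t, n_events = 0.0, 0
    while True:
        rates = mutation_rates(seq, h, J)
        total = rates.sum()  # sequence-specific rate of evolution
        t += rng.exponential(1.0 / total)  # waiting time to the next event
        if t > t_max:
            return seq, n_events
        # pick one (site, state) pair with probability proportional to its rate
        k = rng.choice(rates.size, p=rates.ravel() / total)
        i, b = divmod(k, rates.shape[1])
        seq[i] = b
        n_events += 1

# Toy usage with random (hypothetical) parameters, not a real protein family.
rng = np.random.default_rng(0)
L, q = 6, 4
h = rng.normal(size=(L, q))
J = rng.normal(scale=0.2, size=(L, L, q, q))
start = rng.integers(q, size=L)
end, n = gillespie(start, h, J, t_max=1.0, rng=rng)
print("events:", n, "Hamming distance:", int((start != end).sum()))
```

With this rate choice, a low-energy sequence sees mostly uphill moves, so its total rate is small and waiting times are long, which is the qualitative behaviour described in point 4.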
📜Paper:
biorxiv.org/content/10.1101/…
#ProteinEvolution #Epistasis #Phylogenetics #GenerativeModels #ContinuousTime #EvolutionaryDynamics
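
For intuition on the underestimation bias in points 3 and 6, the sketch below shows a generic independent-site distance estimator: the Jukes-Cantor formula generalized to q equally likely states. It is an assumed stand-in for "an independent-site model", not the inference method used in the paper, and the function names are hypothetical. If epistasis makes sequences diverge more slowly than this model expects, the observed fraction of differing sites p is smaller, and the inferred distance falls below the true elapsed time.

```python
import math

def expected_hamming(t, q=20):
    """Expected fraction of differing sites after t substitutions per site,
    under an equal-rate, independent-site model with q states."""
    p_max = (q - 1) / q
    return p_max * (1.0 - math.exp(-t / p_max))

def independent_site_distance(p, q=20):
    """Invert expected_hamming: estimate substitutions per site from the
    observed fraction p of differing sites."""
    p_max = (q - 1) / q
    if not 0 <= p < p_max:
        raise ValueError("p must lie in [0, (q-1)/q)")
    return -p_max * math.log(1.0 - p / p_max)

# Round trip: the estimator recovers t when the independent-site model holds.
t_true = 0.7
p_obs = expected_hamming(t_true)
print(independent_site_distance(p_obs))

# A smaller observed divergence at the same true time gives a smaller estimate.
print(independent_site_distance(0.8 * p_obs) < t_true)
```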