📢 Data Release Tuesday: Align TEV Protease Dataset
📊 We’re expanding the The Align Foundation data ecosystem again with ~30,000 high-quality GROQ-seq data points capturing TEV protease sequence–function relationships at scale.
To our knowledge, this is the largest mutational dataset on TEV protease to date. Notably, no comprehensive deep mutational scanning study across the full protein has been reported in over three decades, leaving key aspects of its functional landscape unexplored.
TEV protease is a cornerstone tool in biotechnology, known for its high substrate specificity, and this dataset provides a rich resource for enzyme engineering and ML-driven protein design.
This release was made possible by a strong cross-team effort, with key contributions from Erika Alden DeBenedictis, @Anjali Chadha,
@Dave Ross’s team at National Institute of Standards and Technology (NIST) and the DAMP Lab at Boston University
🔗 Access the dataset:
hubs.la/Q048Z3890
#OpenScience #SyntheticBiology #ProteinEngineering #BioAI #MachineLearning #AlignData #GROQSEQ #Protease #TEV