AlphaFind v2: Similarity Search in AlphaFold DB and TED Domains across Structural Contexts
1 AlphaFind v2 is a web application designed for fast structure-based similarity search in the AlphaFold Database of predicted protein structures, addressing the computational challenges of large-scale 3D structure comparison.
2 It combines fast pre-filtering using protein embeddings that preserve structural information with refinement via US-align, balancing search speed and biological relevance.
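The two-stage idea in point 2 can be sketched roughly as follows. This is a minimal illustration, not AlphaFind's actual code: the names `two_stage_search` and `refine_score`, the cosine-similarity ranking, and the toy embeddings and scores are all assumptions. In the real tool the refinement score would come from a US-align structural alignment, not a lookup table.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_search(query_id, embeddings, refine_score, shortlist_size=2, top_k=2):
    """Cheap embedding pre-filter, then expensive re-scoring of the shortlist.

    `refine_score` is a hypothetical stand-in for a structural aligner
    that returns a similarity score (e.g. a TM-score) for a pair of IDs.
    """
    query_vec = embeddings[query_id]

    # Stage 1: rank all other entries by embedding similarity, keep a shortlist.
    shortlist = sorted(
        (cid for cid in embeddings if cid != query_id),
        key=lambda cid: cosine_similarity(query_vec, embeddings[cid]),
        reverse=True,
    )[:shortlist_size]

    # Stage 2: re-score only the shortlist with the expensive comparison.
    rescored = [(cid, refine_score(query_id, cid)) for cid in shortlist]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_k]


# Toy data (made up for illustration only).
toy_embeddings = {
    "Q": [1.0, 0.0],
    "A": [0.9, 0.1],
    "B": [0.8, 0.6],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
}
toy_tm_scores = {"A": 0.55, "B": 0.80, "C": 0.95, "D": 0.10}

results = two_stage_search(
    "Q", toy_embeddings, refine_score=lambda q, c: toy_tm_scores[c]
)
print(results)
```

Note that entry "C" never reaches the refinement stage in this toy run even though its toy score is the highest: the pre-filter is approximate, which is the speed-versus-recall trade-off that any scheme of this kind makes.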
3 The tool offers six complementary search modes: full-protein chain search, pLDDT-filtered searches at thresholds of 70, 80, and 90, TED domain search, and TED multidomain search.
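To make the pLDDT-filtered modes concrete, here is a small sketch of what a per-residue confidence filter could look like. It assumes that filtering means dropping residues whose pLDDT falls below the threshold; how AlphaFind v2 applies the thresholds internally is not stated in this summary. The function name `filter_by_plddt` and the residue values are made up. AlphaFold model files store per-residue pLDDT (0 to 100) in the B-factor field, so in practice the scores would be read from there.

```python
def filter_by_plddt(residue_plddt, threshold):
    """Keep (residue_index, plddt) pairs whose pLDDT is at least `threshold`."""
    return [(idx, score) for idx, score in residue_plddt if score >= threshold]


# Hypothetical per-residue pLDDT values for a five-residue toy chain.
residues = [(1, 35.2), (2, 68.9), (3, 74.5), (4, 88.1), (5, 93.7)]

# Residue indices retained at each of the three thresholds named above.
kept = {
    threshold: [idx for idx, _ in filter_by_plddt(residues, threshold)]
    for threshold in (70, 80, 90)
}
print(kept)
```

Stricter thresholds keep fewer residues, so the higher-threshold modes compare only the most confidently predicted parts of each chain.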
4 It supports optional filtering by organism, taxonomy ID, or CATH label, and links search results to corresponding experimental protein structures.
5 The approximate search phase delivers results in seconds, and full structural refinement completes in under a minute on average, outperforming tools such as Foldseek Server and Merizo-search in both speed and average TM-score.
6 Key applications include finding homologs of proteins that contain disordered regions by using pLDDT filtering, and detecting conserved multidomain architectures, as demonstrated in case studies of the PIN3 and NCAM1 proteins.
7 The web server is built with a Python backend, a Flask REST API, Celery for asynchronous tasks, and an OpenSearch vector database, and is deployed on Kubernetes for scalable performance.
8 AlphaFind v2 uses AlphaFold DB version 4 and precomputed embeddings, with all functionality freely accessible to users without login requirements.
📜Paper:
biorxiv.org/content/10.64898…
#AlphaFind #ProteinStructure #StructuralBiology #AlphaFold #Bioinformatics #StructuralSimilarity