CytoVerse: Single-Cell AI Foundation Models in the Browser
1 CytoVerse searches a 23-million-cell reference atlas and runs a full foundation model inside your browser: no data upload, no server-side compute, no privacy trade-off.
2 CytoVerse compiles the SCimilarity scRNA-seq foundation model to ONNX and pairs it with IVFPQ compressed indexing, so query cells are embedded and labeled locally at >5,000 cells/min on a laptop.
3 The server hosts only static files: pre-computed PQ codebooks, centroids, and the ONNX weights. Once these are cached, every subsequent search is pure client-side computation, cutting cloud costs to near zero.
4 A lightweight “user reference” JSON lets collaborators exchange curated embedding windows instead of raw counts, keeping sensitive unpublished data private while still speaking a common ontology.
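A minimal sketch of what such a "user reference" file could look like. The schema here (the `name`, `model`, `dim`, and `cells` fields) and the `make_user_reference` helper are hypothetical and chosen for illustration only; the actual CytoVerse format may differ. The point is that only embeddings and labels are serialized, never raw counts.

```python
import json

def make_user_reference(name, embeddings, labels, model="SCimilarity"):
    """Bundle embeddings and labels (no raw counts) into a JSON-ready dict.

    Hypothetical schema for illustration; not the documented CytoVerse format.
    """
    if len(embeddings) != len(labels):
        raise ValueError("one label per embedding is required")
    dim = len(embeddings[0]) if embeddings else 0
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must share one dimension")
    return {
        "name": name,
        "model": model,  # which embedding model produced the vectors
        "dim": dim,
        "cells": [
            {"label": label, "embedding": [float(x) for x in emb]}
            for emb, label in zip(embeddings, labels)
        ],
    }

# Toy example: two 3-dimensional embeddings with made-up values.
user_ref = make_user_reference(
    "toy-neurons",
    [[0.5, -1.25, 0.0], [1.0, 0.25, -0.5]],
    ["inhibitory neuron", "excitatory neuron"],
)
user_ref_text = json.dumps(user_ref)       # what a collaborator would receive
user_ref_back = json.loads(user_ref_text)  # what they would load
```

Because the file carries labels and vectors only, a collaborator can search against it without ever seeing the underlying expression matrix.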
5 A built-in perturbation-based explainer scores each gene’s contribution to the embedding; comparing inhibitory and excitatory neurons recovers the expected markers (ROBO2, GRIK1, LRP1B) without retraining.
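The general idea of a perturbation-based score can be sketched as follows: set one gene to a baseline value, re-embed the cell, and measure how far the embedding moved. This is an assumption about the approach, not the CytoVerse implementation; `perturbation_scores`, `toy_embed`, and `TOY_WEIGHTS` are hypothetical, and the weights are made-up numbers with no biological meaning.

```python
import math

def perturbation_scores(expr, embed, baseline=0.0):
    """Score each gene by the embedding shift when it is set to `baseline`."""
    reference = embed(expr)
    scores = {}
    for gene in expr:
        perturbed = dict(expr)      # copy so the input profile is untouched
        perturbed[gene] = baseline  # knock out a single gene
        scores[gene] = math.dist(reference, embed(perturbed))
    return scores

# Stand-in for the real model: a fixed linear map into 2 dimensions.
TOY_WEIGHTS = {
    "ROBO2": (1.0, 0.0),
    "GRIK1": (0.0, 0.5),
    "HOUSEKEEPING": (0.0, 0.0),  # hypothetical gene with no influence
}

def toy_embed(expr):
    return [sum(expr[g] * TOY_WEIGHTS[g][d] for g in expr) for d in range(2)]

toy_expr = {"ROBO2": 2.0, "GRIK1": 2.0, "HOUSEKEEPING": 5.0}
gene_scores = perturbation_scores(toy_expr, toy_embed)
ranked_genes = sorted(gene_scores, key=gene_scores.get, reverse=True)
```

In the browser, `embed` would be a call into the ONNX model, so the cost is roughly one forward pass per perturbed gene unless the perturbations are batched.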
6 Tested on 10k NGN2 neurons from the SSPsyGene consortium, CytoVerse reached >80% accuracy against a 600k fetal-brain reference in under a minute, demonstrating real-time consortium-scale reuse.
7 Exact kNN would need ~20 GB RAM; IVFPQ cuts the working set to ~130 kB per partition with only a 13% label mismatch at high compression, preserving biological structure for exploratory analysis.
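A toy sketch of how an IVFPQ lookup saves memory: each reference cell is stored as a few small codeword indices instead of a full float vector, and a query only touches the partition whose centroid is nearest. All names, sizes, and values below are illustrative assumptions. For simplicity this version encodes raw vectors and probes a single partition, whereas production IVFPQ indexes commonly encode residuals relative to the centroid and probe several partitions.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Split a vector into m equal-length subvectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def nearest(vec, candidates):
    """Index of the candidate closest to vec."""
    return min(range(len(candidates)), key=lambda i: sq_dist(vec, candidates[i]))

def pq_encode(vec, codebooks):
    """Replace each subvector by the index of its nearest codeword."""
    subs = split(vec, len(codebooks))
    return [nearest(sub, cb) for sub, cb in zip(subs, codebooks)]

def build_index(cells, centroids, codebooks):
    """Assign each cell to its nearest centroid and store only its PQ code."""
    partitions = [[] for _ in centroids]
    for vec, label in cells:
        partitions[nearest(vec, centroids)].append((pq_encode(vec, codebooks), label))
    return partitions

def ivfpq_search(query, centroids, partitions, codebooks, k=1):
    """Probe the nearest partition; rank codes by table-lookup distance."""
    part = nearest(query, centroids)
    subs = split(query, len(codebooks))
    # Distance from each query subvector to every codeword, computed once.
    table = [[sq_dist(sub, cw) for cw in cb] for sub, cb in zip(subs, codebooks)]
    scored = [(sum(table[j][c] for j, c in enumerate(code)), label)
              for code, label in partitions[part]]
    return sorted(scored)[:k]

# Toy setup: 4-d embeddings, 2 subspaces, 2 codewords each, 2 partitions.
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)]]
centroids = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)]
cells = [([0.1, 0.0, 0.0, 0.1], "inhibitory"),
         ([0.9, 1.0, 1.0, 0.9], "excitatory")]
partitions = build_index(cells, centroids, codebooks)
hits = ivfpq_search([0.8, 0.9, 1.0, 1.0], centroids, partitions, codebooks)
```

The accuracy cost comes from the encoding step: distances are computed to codewords rather than to the original vectors, so a coarser codebook means a smaller index and more mismatched labels.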
8 The same pipeline generalizes to any ONNX-exportable embedding model (SIMS, scGPT, etc.), positioning the framework as an evergreen frontend for future single-cell foundation models.
💻Code:
github.com/braingeneers/cyto…
📜Paper:
biorxiv.org/content/10.64898…
#singlecell #scRNAseq #foundationmodel #browserAI #privacy #ONNX #WebAssembly #bioinformatics