ScProtoTransformer: Scalable Reference Mapping Across Molecules, Cells and Donors
1. The rapid accumulation of single-cell data has enabled comprehensive biological system characterization, but scalable reference mapping across different resolutions remains a major challenge. ScProtoTransformer addresses this by introducing a prototype-based Transformer architecture that achieves scalable mapping across molecular, cellular, and donor levels.
2. A key innovation is the knowledge-guided prototype tokenizer, which projects gene expression into biologically interpretable pathway prototypes. This reduces numerical batch effects while preserving biological semantic patterns, making it a powerful tool for cross-scale reference mapping.
3. ScProtoTransformer leverages knowledge distilled from foundation models and a dynamic supervised fine-tuning strategy. This allows it to inherit the knowledge of large-scale pretraining models without requiring extensive pretraining itself, significantly reducing computational costs.
4. Benchmark experiments demonstrate that ScProtoTransformer delivers competitive or superior performance compared to state-of-the-art methods across molecular, cell, and donor-level reference mapping tasks. It also provides interpretability through biologically meaningful prototypes.
5. The method supports multi-level reference mapping: gene embeddings enable molecular-level mapping, cell embeddings support cell-level mapping, and donor-level mapping is achieved by aggregating embeddings from the same donor sample. This lays the foundation for integrative analysis across different biological scales.
6. ScProtoTransformer shows strong performance in cross-modal and cross-batch integration tasks, outperforming specialized integration methods. It also demonstrates adaptability to spatial data without relying on non-molecular features like spatial coordinates.
7. The study includes comprehensive ablation experiments, validating the necessity of the prototype tokenizer, knowledge distillation loss, and dynamic SFT loss in achieving robust performance across different levels of biological analysis.
📜Paper: biorxiv.org/content/10.64898…
#ComputationalBiology #SingleCellData #TransformerArchitecture #ReferenceMapping #Bioinformatics
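Point 5 of the ScProtoTransformer thread says donor-level mapping is achieved by aggregating cell embeddings from the same donor sample. The thread does not say which aggregation is used, so the following is only a minimal sketch that assumes simple mean pooling. The function name `aggregate_by_donor` and the toy data are hypothetical and are not taken from the paper or its code.

```python
from collections import defaultdict


def aggregate_by_donor(cell_embeddings, donor_ids):
    """Mean-pool cell embeddings per donor (mean pooling is an assumption)."""
    if len(cell_embeddings) != len(donor_ids):
        raise ValueError("need exactly one donor id per cell embedding")
    sums = {}
    counts = defaultdict(int)
    for vec, donor in zip(cell_embeddings, donor_ids):
        if donor not in sums:
            sums[donor] = [0.0] * len(vec)
        # Accumulate each embedding dimension for this donor.
        for i, value in enumerate(vec):
            sums[donor][i] += value
        counts[donor] += 1
    # Divide the per-dimension sums by the number of cells seen per donor.
    return {d: [s / counts[d] for s in total] for d, total in sums.items()}


# Toy data: three cells with 2-dimensional embeddings from two donors.
cells = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
donors = ["donor_A", "donor_A", "donor_B"]
donor_embeddings = aggregate_by_donor(cells, donors)
```

The resulting per-donor vectors could then be compared against reference donors in the same way cell embeddings are compared against reference cells. Other pooling choices (median, attention-weighted) would fit the same interface.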
An AI system to help scientists write expert-level empirical software
1. Researchers have developed an AI system that can create expert-level scientific software to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS) to systematically improve the quality metric and navigate the large space of possible solutions.
2. The system demonstrated its effectiveness across a wide range of benchmarks, including discovering 40 novel methods for single-cell data analysis that outperformed top human-developed methods on a public leaderboard.
3. In epidemiology, the AI system generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations.
4. The method also produced state-of-the-art software for geospatial analysis, neural activity prediction in zebrafish, time series forecasting, and numerical solution of integrals.
5. By integrating complex research ideas from external sources, the system represents a significant step towards accelerating scientific progress. It can exhaustively and tirelessly carry out solution searches at an unprecedented scale.
6. The AI system is capable of combining ideas from different methods to create new, more effective solutions. For example, it combined two existing methods to improve batch integration in single-cell RNA sequencing data.
7. The system can also generate novel forecasting strategies for COVID-19 prediction by recombining and optimizing existing models. This demonstrates its ability to innovate and hybridize expert-level strategies.
8. In geospatial analysis, the AI system achieved state-of-the-art performance on the DLRSD benchmark, significantly outperforming reported results in recent academic papers.
9. For neural activity prediction in zebrafish, the system not only outperformed all methods on the current benchmark but also easily incorporated a biophysical simulator into a performant solution.
10. The AI system is a powerful tool for accelerating scientific discovery by systematically exploring a vast solution space to innovate, hybridize, and optimize expert-level solutions across multiple scientific fields.
📜Paper: arxiv.org/abs/2509.06503
#AISystem #ScientificSoftware #EmpiricalSoftware #TreeSearch #LargeLanguageModel #ScientificDiscovery #SingleCellData #COVID19Forecasting #GeospatialAnalysis #NeuralActivityPrediction #TimeSeriesForecasting #NumericalAnalysis
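The thread above describes pairing an LLM with tree search to push a quality metric upward, but it gives no detail on the search itself. As a rough illustration only, here is a generic best-first tree search. It is not the paper's algorithm: `propose` is a hypothetical stand-in for the LLM step that rewrites a candidate, `score` is a hypothetical stand-in for the quality metric, and the integer "solutions" are toy data.

```python
import heapq


def tree_search(root, propose, score, budget=50):
    """Best-first search: repeatedly expand the highest-scoring candidate."""
    best, best_score = root, score(root)
    # heapq is a min-heap, so scores are negated; the counter breaks ties
    # so that candidates themselves are never compared.
    frontier = [(-best_score, 0, root)]
    counter = 1
    for _ in range(budget):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for child in propose(node):
            child_score = score(child)
            if child_score > best_score:
                best, best_score = child, child_score
            heapq.heappush(frontier, (-child_score, counter, child))
            counter += 1
    return best, best_score


# Toy stand-ins: a "solution" is an integer and the metric peaks at 7.
def toy_propose(x):
    return [x - 1, x + 1]


def toy_score(x):
    return -abs(x - 7)


best, best_score = tree_search(0, toy_propose, toy_score, budget=50)
```

In a real setting the candidates would be programs, `propose` would call a model to produce variants, and `score` would run the benchmark. The `budget` argument caps how many candidates are expanded, which matters when each evaluation is expensive.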
News & Views discussing the latest implementation of the computational tool #CellPhoneDB for inferring #cellcellcommunication from #singlecelldata bit.ly/42gSmXi

scELMo: Embeddings from Language Models are Good Learners for Single-cell Data Analysis
1. The paper introduces scELMo, a method leveraging embeddings from GPT-based large language models (LLMs) to improve single-cell data analysis tasks like clustering, batch effect correction, and cell-type annotation.
2. A core innovation is scELMo’s ability to incorporate biological knowledge through text embeddings, surpassing traditional models and domain-specific foundation models like Geneformer and scGPT in both zero-shot and fine-tuning settings.
3. scELMo achieves competitive results using a lightweight structure and low resource requirements, making it accessible for broader biological research without the need for extensive computational resources.
4. The zero-shot learning framework of scELMo excels in cell clustering and batch effect correction by embedding feature-level metadata into biological representations, validated with robust clustering metrics.
5. In fine-tuning tasks, scELMo demonstrates its potential for cell-type annotation and in-silico treatment analysis, identifying novel therapeutic targets and outperforming state-of-the-art methods in accuracy and precision.
6. Unlike resource-heavy models, scELMo combines gene embeddings with specific task models to achieve high adaptability, particularly in challenging tasks like perturbation analysis and modeling gene expression under varied conditions.
7. Through extensive evaluations, scELMo shows stability in embedding performance and captures functional heterogeneity across genes and cells, underlining its utility in diverse biological datasets.
8. The study highlights the integration of LLMs in biomedical data, pushing the boundaries of computational biology with scalable, interpretable, and efficient tools.
@meSuper8 @jaszheg @HongyuZhao2
📜Paper: biorxiv.org/content/10.1101/…
#SingleCellData #LanguageModels #Bioinformatics #AIinBiology #scELMo #DataScience
13 Nov 2024
🚀 Exciting update! DISCO now supports batch downloads—access data from 113,277,088 cells across 18,402 samples with ease. Perfect for training your LLMs or fueling other applications! 🌐📊 #SingleCellData #Bioinformatics #AI @LLM immunesinglecell.org/downloa…
🎃Happy Single CELLoween! 🎃CELLebrate with @10xGenomics! NO TRICKS. JUST TREATS. 😱 Is your lab being haunted by the ghosts of excessive pipetting? 😱 Are you being chased by the ghouls of lost dead cells? 😱 Do you think your lab is cursed? 😱 Do you feel as if you are turning into a Zombie? Time to befriend 🧛 Count Chromiula 🧛 a.k.a Chromium X! He loves blood samples, although he finds PBMCs especially tasty. So contact us today and let's chat how you can get SPOOKtacular results and BOOtiful data. #singlecell #blood #chromium #GEMX #singlecelldata #research #Science #multiomics #halloween2024 #cell
16 May 2024
At the @SEReumatologia Congress, Alejandro Gómez and @ToniJuliaC (@VHIR_) delved into #SjögrensSyndrome, #Lupus, and #RheumatoidArthritis using single cell tech and examined DocTIS's role in advancing #PsoriaticArthritis and #Lupus research through #Genomics and #SingleCellData
Ever wonder how #AI generates #images from text? It uses #DiffusionModels! Learn to apply the same principles to generate #BiologicalSequences. Scan the QR code or use redcap.link/Reg_DiffModels_J… to register for our nanocourse! #DrugDesign #SingleCellData #Bioinformatics #ComputationalBiology
🌌 Charting the Unseen Territories of Single-Cell Data! 🧬 "scTopoGAN" by Akash Singh et al. introduces an unsupervised manifold alignment method, transcending limitations in integrating non-overlapping cells or features. doi.org/10.1093/bioadv/vbad1… #SingleCellData @ahmedElkoussy
Starting the day with the next Keynote Talk from @AedinCulhane: Matrix factorization for integrative ‘omics and single cell biology of cancer. @QBI_UCSF @UCDMedicine @MedicineAtUL #QBISBI2023 #computationaloncology #bioinformatics #singlecelldata
New R package cellxgene.census that gives access to 33M cells, the largest standardized aggregation of single-cell data. #CZCellxGene #SingleCellData
Today #CZCellxGene is releasing the R package cellxgene.census — it gives access from R to Census, the largest standardized aggregation of single-cell data, composed of 33M cells and 60K genes. You can easily export slices to Seurat or @Bioconductor. 🧵⬇️ bit.ly/44Uq5pm
#Singlecell RNA sequencing gives you access to a treasure trove of information. But the amount of data can be a devil in disguise. Meet #tsne. This #machinelearning technique gives you a visual understanding of underlying patterns in #singlecelldata sets. hubs.ly/Q01tKQFz0
FYI: Educational webinar series focusing on bioinformatics analysis of single cell data (free registration) #singlecelldata #bioinformatics linkedin.com/posts/singleron…

We developed and characterized an in vitro model of GSC to mimic this subpopulation. This cell population displays downregulation of immune-associated pathways, both in the in vitro model and in #singlecelldata. 3/5
EMBEDR: Distinguishing signal from noise in single-cell omics data. #SingleCellData #DimensionalityReduction #QualityInData cell.com/patterns/fulltext/S… @Patterns_CP

Call for applications--due 3/15! #SingleCellAnalysis (July 1-16) Instructors: @DMChenowethLab, @mikemc43, @sydshaffer and @yeo_lab Free to apply; Financial aid available To apply, and for course info and COVID-related policies: bit.ly/singlecellcourse2022 #singlecelldata
Comprehensive evaluation of noise reduction methods for single-cell RNA sequencing data. #SingleCellData #scRNAseq #Normalization #BatchCorrection #ToolsBenchmarking academic.oup.com/bib/advance… @BriefingBioinfo
