4/ Pipeline: max-attention per head per example, per-head median binarization, pseudolikelihood Ising, spectral clustering — Bhalla et al.'s recipe, minimally adapted. Then closure: ablate the discovered community, compare per-example damage to five matched random head-sets, report z on loss, accuracy, target-logit.
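A minimal sketch of two steps named in the post, under stated assumptions: `max_attn` is a hypothetical table of max-attention values indexed as `max_attn[example][head]`, and the damage numbers passed to `closure_z` are assumed to be already measured and reduced to one scalar per head-set. The Ising fit and the spectral clustering are not shown, and none of these names come from the original recipe.

```python
from statistics import mean, median, stdev

def binarize_per_head(max_attn):
    """Threshold each head (column) at its own median: 1 if above, else 0."""
    n_heads = len(max_attn[0])
    medians = [median(row[h] for row in max_attn) for h in range(n_heads)]
    return [[1 if row[h] > medians[h] else 0 for h in range(n_heads)]
            for row in max_attn]

def closure_z(community_damage, random_damages):
    """z-score of the community's damage against matched random head-sets."""
    mu = mean(random_damages)
    sigma = stdev(random_damages)  # sample standard deviation
    return (community_damage - mu) / sigma

# Hypothetical data: 4 examples x 2 heads.
max_attn = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.3], [0.7, 0.6]]
spins = binarize_per_head(max_attn)

# Hypothetical loss increases: discovered community vs five random head-sets.
z_loss = closure_z(0.50, [0.10, 0.12, 0.08, 0.11, 0.09])
```

The same `closure_z` call would be repeated for accuracy and target-logit to get the three z values the post mentions.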
Day 9 📚 Today I learnt about the Binning and Binarization topic in Machine Learning. Watched an old orientation session video for the MLP project of the IITM BS Degree program. Also understood several things about Kaggle competitions.
Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization. Yusuke Ohtsubo, Kota Dohi, Koichiro Yawata, Koki Takeshita, Tatsuya Sasaki. arxiv.org/abs/2606.02434 [cs.AI]
Yessss! Wanted to share #STAMPede - a new Python toolkit for exploring and analyzing #STAMP data by @mhlangalab’s talented duo Niels Velthuijs and Siebren Frölich (mhlangalab.org)
If you haven't heard of #STAMP yet… check this paper out: cell.com/cell/fulltext/S0092…
…it's a single-cell method published in @CellCellPress last year, repurposing spatial-omics imagers to profile millions of cells at the lowest cost yet, without compromising data quality - no sequencing required. It retains cell morphology and supports multimodal profiling (RNA, protein, and HnE) all in one workflow.
But powerful data needs powerful tools. That's where #STAMPede comes in. Built to handle the massive, shallow-depth datasets that #STAMP generates, #STAMPede gives you:
→ Familiar scanpy-style syntax so there's almost no learning curve
→ Full pipeline support: QC, filtering, binarization, dimensionality reduction, clustering, and differential expression
→ Easy installation via pip or conda
Whether you're exploring rare cell populations or running large-scale perturbation studies, STAMPede is designed to keep up.
Check it out 👉 siebrenf.github.io/stampede/
@DrJasPlummer #SingleCell #Bioinformatics #Genomics #OpenSource #STAMP #scRNAseq #ComputationalBiology
The performative anxiety around Ravana being a Shiva Bhakta or having any virtues at all is the perfect illustration of what happens when the social becomes the dominant compass over the spiritual knowledge core. The Hindu mind abhors easy binarization because Dharma is sukshma and human nature is volatile. It is to help us understand the essential human condition that the empathy of our rishis gave us our Itihasa and Purana - to make the eternal essence of the Vedic corpus relatable to everyone. The spiritual principle explained through Ravana is that one might have great qualities, but when they fail Dharma, they devolve. This is all the more critical for those who are extraordinary, because their fall is the swiftest and gravest. Tradition is as clear on the extraordinary attributes of Ravana as it is on his fall. There is neither glorification nor obfuscation. If you wish to understand the real lesson on Ravana that comes to us from the Valmiki Ramayana, please go through this beautiful thread by @jamvasu garu x.com/jamvasu/status/2037223… Sample this excerpt from the thread: "अहो रूपमहो धैर्यमहो सत्त्वमहो द्युतिः। अहो राक्षसराजस्य सर्वलक्षणयुक्तता।। 'What form! What courage! What strength! What radiance! This king of rākṣasas is endowed with every great quality.' Hanumān is not praising an enemy. He is bearing witness to a truth: Rāvaṇa's greatness was real. rūpa. dhairya. sattva. dyuti. Sarvalakṣaṇayuktatā: the fullness of all great qualities. By any measure, he was extraordinary." We can only hope that we don't cancel Hanuman ji in our social stupor since he praises Ravana on certain attributes. "The ability to hold opposing ideas and be able to function is the hallmark of a first rate intelligence" - F. Scott Fitzgerald. This is what our Shastras equip us to do, as long as we remain humble and do not make them an instrument to serve our agenda.

1/ When Greatness Has No Foundation Rāvaṇa had everything. Form. Courage. Strength. Radiance. Scholarship. Valour. And yet he fell. A thread from the Sundarakāṇḍa on Dharma 🧵
Lngram: Latent N-Gram Memory for All-Modal LLMs
LLMs have a fatal architectural inefficiency 🚨
Insights from Zhihu Contributor 我是猫👇
We force the same Transformer stack to handle two entirely different tasks:
🧠 Hard dynamic work: Multi-step reasoning & long-range dependencies
📌 Trivial static work: Fixed phrases, entities & domain pattern matching
Transformers lack native lookup logic. They waste precious layer depth endlessly re-learning basic static patterns instead of focusing on real reasoning 💥
🔹 Engram → Lngram: Fixing the Core Flaw
Engram added memory branches to offload pattern matching, but it’s trapped by tokenizer IDs ❌
It builds N-gram keys from tokenizer IDs ❌
This creates 3 critical bottlenecks:
1️⃣ Token segmentation rarely aligns with true semantic units
2️⃣ Token N-grams explode combinatorially, causing hash collisions & information loss
3️⃣ Text-only limitation: incompatible with vision, robotics & VLA tasks
Lngram redefines the paradigm: N-gram lookup moves from rigid token space to flexible latent hidden space ✨
🔹 How Lngram Works (4 Simple Steps)
A lightweight residual plug-and-play branch for Transformers (no backbone rewrite needed):
1️⃣ Hidden State Discretization: Compresses continuous hidden states into compact 4-bit latent symbols, preserving all task-essential information
2️⃣ Latent N-Gram Key Generation: Builds 2/3-gram keys for exact, fast table lookup, with no attention and no approximate search overhead
3️⃣ Context-Aware Gating: Dynamically weights retrieved memory vectors via current context, avoiding rigid static pattern misuse
4️⃣ Residual Fusion: Injects optimized memory features back into the backbone, leaving native attention/MLP fully functional
🔹 The Key Difference: Engram vs Lngram
🟠 Engram: Token-bound, text-only, fixed hash memory (tokenizer-dependent)
🔵 Lngram: Latent-space, full multimodal, learnable routing memory
Not RAG. Not attention replacement. Pure, in-model trainable memory optimization 🧠
🔹 Training Innovation: Counterfactual Surrogate Gradient
Hard discretization breaks raw gradients, and naive STE training fails entirely ❌
Lngram solves this with counterfactual surrogate gradients: It calculates precise local gradients by contrasting embedding outputs of 0/1 bit flips, aligning discrete routing with loss optimization
Result: Rock-solid training stability & ultra-precise latent symbol learning ✅
🔹 Key Proof: Binarized Latent States Are Powerful
Qwen3-1.7B Window Attention validation:
✅ Preserves 99% of general language capability
✅ NIAH-32k long-context score: 6.2 → 100 (on par with global attention!)
Binarization filters noise, retains core pattern-matching signals 📊
🔹 SOTA Benchmark Performance (All Scales)
🏆 2B MoE Pre-Training
Lngram outperforms vanilla MoE (+1.41%) & Engram (+0.62%) under identical parameter budgets
A 23-layer Lngram model beats a 24-layer vanilla MoE: it directly boosts effective model depth 💪
🏆 Large Model & Long-Train Robustness
Gains hold steady at 140B training tokens (2B) & full 8B model scale, with no small-model bias
🏆 64k Long-Context Modeling
Consistently lower perplexity across all context ranges, freeing attention for true long-range dependency modeling
🏆 Domain Adaptation (Zero Extra Overhead)
Freeze pretrained backbone, only train Lngram: Driving benchmark accuracy jumps from 50.59 → 55.73%
Joint fine-tune pushes it to 62.45% (far surpasses full vanilla fine-tuning)
🔹 True Cross-Modality Power 🎥🤖
No tokenizer dependency = works for all modalities:
🖼️ VLM (LLaVA setup): Average benchmark score +0.7%, big gains in visual reasoning (+2.1%) & object localization
🤖 VLA Robotics: Improved long-horizon & spatial task success rates on LIBERO/StarVLA
🔹 Internal Mechanism: Boost Effective Model Depth
LogitLens & CKA analysis confirm:
✅ Lngram pushes task-related information to earlier Transformer layers
✅ Creates an effective depth gain of 2–3 layers
✅ Separates static lookup tasks from dynamic reasoning tasks (perfect computational division of labor)
🔹 Deployment Efficiency 🚀
✅ Prefill speed +16.7%, latency -14.3% (faster long-sequence parallel processing)
✅ Decode latency slightly higher (+6.7%) but zero incremental cache growth
✅ No exploding memory overhead with longer context lengths
🔹 Limitations (No Hype, Full Transparency)
❌ Not a replacement for Transformer reasoning
❌ Slight general capability drop after heavy domain adaptation
❌ Less effective for dynamic cross-instance relational modeling
💡 Core Industry Insight
Future LLM progress isn’t just about scaling parameters; it’s about specialized computational division. Let attention handle long-range context, MLP handle complex reasoning, and Lngram handle static local pattern lookup. This is the next evolution of efficient Transformer design 🔥
#AIResearch #Lngram #Engram #LLMArchitecture #DeepLearning #AIEfficiency #LLMTraining #VLM #VLA
🔗 Full article: zhuanlan.zhihu.com/p/2042559…
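The four steps in the thread (discretize, build an n-gram key, gate, fuse residually) can be pictured with a toy sketch. Everything concrete here is an assumption for illustration: the post does not say how the 4-bit symbol is computed, so `to_symbol` uses one sign bit per row of a hypothetical projection matrix, the table is a plain dict, and the gate is a single sigmoid. Training, including the counterfactual surrogate gradient, is not shown.

```python
import math

def to_symbol(hidden, projections):
    """Map a hidden vector to a 4-bit symbol: one sign bit per projection row."""
    symbol = 0
    for row in projections:  # 4 rows give 4 bits
        dot = sum(w * x for w, x in zip(row, hidden))
        symbol = (symbol << 1) | (1 if dot > 0 else 0)
    return symbol

def ngram_key(symbols, n):
    """Pack the last n 4-bit symbols into one integer key for exact lookup."""
    key = 0
    for s in symbols[-n:]:
        key = (key << 4) | s
    return key

def lngram_step(hidden, symbols, table, gate_w, n=2):
    """Look up a memory vector by latent n-gram key, gate it, add residually."""
    memory = table.get(ngram_key(symbols, n))
    if memory is None:
        return list(hidden)  # no entry: the backbone state passes through
    logit = sum(w * x for w, x in zip(gate_w, hidden))
    gate = 1.0 / (1.0 + math.exp(-logit))  # context-dependent weight in (0, 1)
    return [h + gate * m for h, m in zip(hidden, memory)]

# Hypothetical 2-dimensional example.
projections = [[1, 0], [0, 1], [-1, 0], [0, -1]]
hidden = [0.5, -2.0]
symbols = [3, to_symbol(hidden, projections)]
fused = lngram_step(hidden, symbols, {ngram_key(symbols, 2): [2.0, 4.0]}, [0.0, 0.0])
```

Because the key is an exact integer, the lookup needs no attention and no approximate search, which is the property the thread emphasizes.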
Predicting knockout-induced transcriptomic responses from unperturbed single-cell RNA-seq
IGNITE infers a gene regulatory network (GRN) directly from unperturbed scRNA-seq data, then uses that GRN to simulate in silico gene knockouts. From pseudotime-ordered single-cell expression dynamics, this model estimates a directed, weighted, and signed effective GRN and generates wild-type and KO cell-state patterns.
1) Data used for machine learning
mouse PSCs: 10x scRNA-seq during naïve-to-formative transition
- 9894 cells across 0, 6, 12, 24, and 48 h
- model input: curated 24-gene PSC transition program
human PSCs: independent differentiation dataset toward definitive endoderm
- 758 cells across 0, 12, 24, 36, 72, and 96 h
- model input: 98 transcriptional regulators
2) From scRNA-seq to gene activity
scRNA-seq counts → QC and log-normalization → dimensionality reduction / clustering → Slingshot pseudotime ordering → Mini-Bulk smoothing along pseudotime → binarization by gene-specific half-maximum expression → gene activity
3) Kinetic Ising model as a generative GRN
- genes are treated as ON/OFF activity states
- the inferred GRN defines activating or repressing effects between genes
- gene states are updated probabilistically by Glauber dynamics
- repeated updates generate wild-type-like cell-state patterns
4) GRN inference as an inverse problem
observed gene activity transitions → infer signed gene–gene effects that can reproduce them → generate WT-like cells from 250 candidate GRNs → select the GRN that best preserves the input correlation structure
5) Mouse KO prediction
Model condition:
- unperturbed mouse PSC scRNA-seq
- curated 24-gene transition program
- KO = target-gene interaction removal from the inferred GRN
Spearman correlation between predicted and experimental KO–WT gene-activity changes:
Rbpj KO: ρ = 0.531
Etv5 KO: ρ = 0.803
Tcf7l1 KO: ρ = 0.010
triple KO: ρ = 0.716
My scientific thought: For KO prediction, the model focused on 24 genes known to be involved in the naïve-to-formative transition of mouse PSCs. Because the correlation was calculated within this restricted and biologically important gene set, the reported values indicate relatively high predictive accuracy. This may also avoid the problem that correlations based on whole-RNA-seq data can appear high simply because of the large number of genes, although I would still be interested in seeing the whole-gene correlation. Gene-wise KO-WT change correlation is useful as a measure of molecular validity of the perturbation response, whereas cell-state prediction is useful for asking whether that response has biological phenotypic meaning.
#Bioinformatics #ComputationalBiology
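To make the binarization, Glauber-update, and knockout steps concrete, here is a small sketch in plain Python. It is not the IGNITE implementation: the ±1 encoding of ON/OFF, the one-gene-at-a-time update, the unit temperature, and all function names are assumptions chosen for illustration.

```python
import math
import random

def binarize_half_max(expr):
    """expr[gene] is a list of smoothed values; ON (+1) if >= half that gene's max."""
    out = {}
    for gene, values in expr.items():
        half_max = 0.5 * max(values)
        out[gene] = [1 if v >= half_max else -1 for v in values]
    return out

def glauber_step(state, J, h, rng):
    """Update one randomly chosen gene using the Glauber probability of being ON."""
    i = rng.randrange(len(state))
    field = h[i] + sum(J[i][j] * state[j] for j in range(len(state)))
    p_on = 1.0 / (1.0 + math.exp(-2.0 * field))
    state[i] = 1 if rng.random() < p_on else -1
    return state

def knock_out(J, k):
    """Remove every interaction into and out of gene k (row and column set to 0)."""
    n = len(J)
    return [[0.0 if (i == k or j == k) else J[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical usage: two genes with no coupling and a strong ON bias.
rng = random.Random(0)
state = [-1, -1]
for _ in range(50):
    glauber_step(state, [[0.0, 0.0], [0.0, 0.0]], [50.0, 50.0], rng)
```

Here `J[i][j]` is read as the signed effect of gene `j` on gene `i`, so a positive entry activates and a negative entry represses. Simulating with `knock_out(J, k)` in place of `J` mimics the interaction-removal knockout described in the post.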
one of the coolest projects ive ever done for my native studies cert. was on the differences between different indigenous concepts of gender identity and how the binarization is a product of colonial understanding.
Replying to @junebug1408_
a fuckkk ton of kirby characters are canonically nonbinary yet the binarization of nonbinary ppl continues to ensue
People put too much emphasis on an arbitrary percentage derived from the binarization of said collection, yes. Thanks for agreeing.
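Shot by combining the world mirror (binarization mode), a VRC Print (normal color), and the personal mirror (made quite transparent, so the binarization and normal-color photos overlap) 📸 Trying a different combination gave this one a distinctive atmosphere #VRC鏡遊び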
1/6 🧵 Data preprocessing can make or break your ML model. Two techniques that often get confused are Binning and Binarization. They both transform numerical data, but their goals are very different. Let’s break them down. 👇 #MachineLearning #DataScience #Python
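A short sketch of the difference, using only the standard library. The function names, the age data, and the cut points are made up for illustration. Binarization applies one threshold and yields 0 or 1; binning applies several sorted edges and yields an ordinal bin index.

```python
from bisect import bisect_right

def binarize(values, threshold):
    """Binarization: one threshold, two outcomes (1 if above it, else 0)."""
    return [1 if v > threshold else 0 for v in values]

def bin_values(values, edges):
    """Binning: several sorted edges, one ordinal bin index per value."""
    return [bisect_right(edges, v) for v in values]

ages = [5, 17, 18, 42, 70]
print(binarize(ages, 17))          # [0, 0, 1, 1, 1]  adult flag
print(bin_values(ages, [18, 65]))  # [0, 0, 1, 1, 2]  minor / adult / senior
```

In scikit-learn the corresponding transformers are `Binarizer` and `KBinsDiscretizer`.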
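[ALT details] A shot made by layering three photos of the same composition (default camera settings → sketch and mosaic filters, world mirror in binarization mode), taken as VRC Prints, through a heavily faded personal mirror 📸 Note: no compositing (the photo was finished inside VRC); only one of the three photos was taken with the avatar in frame.
"A Christmas Without Me" #VRChatPhotography #VRC鏡遊び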
Happy Metric System Day! 📏 📐⚖️🧪 (April 7) Automated measurement of nanofibers: the diameters of nanofibers were measured automatically using low-accelerating-voltage SEM images without charging. Each fiber is recognized by image binarization, and its length, angle, and diameter were reported automatically, along with the average values, standard deviation, minimum and maximum values, and their distribution. To learn more about the measurement tools, check out the following post!
🚀 Excited to announce that Phase 1 of the Entreacte OCR project is officially launching this Monday! 🔄 The core innovation? Reversed binarization — turning inter-character spaces into primary computational objects. Voids become data. 🐍 Built in Python with OpenCV, scikit-image, and NumPy. First phase focuses on binarization, skeletonization, and interval extraction. 🧠 This is a patent-backed research project bridging philosophy of computing and machine-centered vision. 🏢 Proudly developed under Hebdómada, Unipessoal Lda — bringing novel computer vision to life, one interval at a time. 🔜 Phases 2 and 3 will expand toward modular experimentation, OCR integration, and multi-script support. 📄 Paper upcoming. Conference in June. More to come. #OCR #ComputerVision #Entreacte #ReversedBinarization #DeepLearning #OpenCV #Python #AI #Hebdomada #Research #Patent
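One way to picture "voids become data" is the toy sketch below. It is not the Entreacte method, which the post does not describe in detail. It assumes a grayscale text line stored as a list of pixel rows, a fixed threshold of 128, and gaps defined as runs of columns with no ink. Skeletonization is not shown.

```python
def binarize_image(gray, threshold=128):
    """Ink pixels (darker than the threshold) become 1, background becomes 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def gap_intervals(binary):
    """Return (start, end) column ranges, end exclusive, that contain no ink."""
    width = len(binary[0])
    empty = [all(row[c] == 0 for row in binary) for c in range(width)]
    intervals, start = [], None
    for c, is_empty in enumerate(empty):
        if is_empty and start is None:
            start = c                      # a gap opens
        elif not is_empty and start is not None:
            intervals.append((start, c))   # a gap closes at the next ink column
            start = None
    if start is not None:
        intervals.append((start, width))   # gap running to the right edge
    return intervals

# Hypothetical 2 x 7 text line: dark strokes at columns 1, 4, and 5.
line = [[255, 0, 255, 255, 0, 0, 255],
        [255, 0, 255, 255, 0, 0, 255]]
gaps = gap_intervals(binarize_image(line))
```

The resulting interval list describes the spaces between strokes rather than the strokes themselves.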