The Ultimate AI Blend (Task Arithmetic) 🧮 Blending Arts & Science: The Magic of Task Arithmetic Want to inject advanced "Math & Coding" logic into a base model that excels purely in "Korean Reasoning"? Treat Skills as Vectors: You can add or subtract the weight differences of task-specific models just like basic math! (Korean Base + Math Logic Vector). Precision Control: Amplify specific capabilities or erase unwanted traits without any complex fine-tuning. Build your own customized genius AI instantly. #mayaai #mayax #matx #mayafreeai #ModelMerging
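The "skills as vectors" idea in the post above can be sketched in a few lines. This is a minimal illustration, not the poster's implementation: weights are plain Python lists, and the names `task_vector`, `apply_task_vectors`, and the toy models are hypothetical. It assumes every model was fine-tuned from the same pretrained checkpoint, which task arithmetic needs for the weight differences to be comparable.

```python
from typing import Dict, List

# Hypothetical toy representation: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Weight difference between a fine-tuned model and its pretrained base."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_vectors(
    target: Weights, vectors: List[Weights], scales: List[float]
) -> Weights:
    """Add scaled task vectors to a model. A negative scale subtracts a skill."""
    merged = {name: list(values) for name, values in target.items()}
    for vector, scale in zip(vectors, scales):
        for name, delta in vector.items():
            merged[name] = [m + scale * d for m, d in zip(merged[name], delta)]
    return merged


# Toy checkpoints (assumed values, for illustration only).
pretrained = {"w": [1.0, 2.0]}
korean = {"w": [1.5, 2.0]}      # fine-tuned for Korean reasoning
math_model = {"w": [1.0, 3.0]}  # fine-tuned for math and coding

math_vec = task_vector(math_model, pretrained)
merged = apply_task_vectors(korean, [math_vec], [1.0])
print(merged["w"])  # [1.5, 3.0]
```

The scale is the "precision control" knob: values above 1.0 amplify the added skill, and a negative value removes it. In practice the scale is usually tuned on held-out data, since large values can degrade the target model.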
How did #ECCV reviews treat you? 😬 If you're not that happy — and you work on #unlearning, #modelediting, #modelmerging, or #interpretability — consider submitting to the 3rd Workshop and Challenge on Unlearning and Model Editing at @eccvconf We accept both full papers and extended abstracts, so there's a format for every contribution. 📅 Submission deadline: 9 July 2026 (AoE) 🔗openreview.net/group?id=thec…
📢CFP: U&ME Workshop @ ECCV 2026 @eccvconf We invite submissions on machine unlearning, model editing, and related topics, including efficient adaptation and responsible AI. Details at: sites.google.com/view/u-and-… @_iAc #ECCV #ECCV2026 #Unlearning #AIsafety #ResponsibleAI
🚨 LLMs are frozen after pretraining, but the world keeps changing. How do you give an LLM new knowledge without retraining it, bloating its context, or breaking what it already knows?
Existing methods hit a wall:
🔸 RAG is brittle to retrieval noise and struggles with cross-document reasoning;
🔸 Fine-tuning is expensive and causes catastrophic forgetting;
🔸 Latent memory is tightly coupled to the model that produced it.
👉 Key question: Can we encode knowledge into a small, dedicated memory model that any LLM can query without accessing the LLM itself?
🚀 Introducing MeMo (Memory as a Model) 🚀 We train a dedicated MEMORY model on a reflection Question-Answer dataset synthesized from the target corpus. At inference, a frozen EXECUTIVE model (any LLM, including closed-source models) queries the MEMORY model through a structured 3-stage protocol that decomposes complex queries into targeted sub-queries to retrieve precise, noise-robust knowledge and reasons over the responses.
🔥 Key Highlights
🧠 5-step data synthesis pipeline captures explicit facts, implicit relationships, and cross-document connections as reflections;
🛡️ Robust to retrieval noise: where RAG drops up to 6.22% with added distractors, MeMo holds steady;
🔌 Plug-and-play with any LLM, no weights, gradients, or logits required;
📦 Fixed inference cost, independent of corpus size;
🔄 Continual integration via model merging: 33% compute savings over full retraining and scaling benefits grow with the number of corpora.
📊 Strong results across BrowseComp-Plus, NarrativeQA, and MuSiQue, matching or outperforming retrieval baselines (BM25, NV-Embed-V2, HippoRAG2) with gains of up to 27% on NarrativeQA when paired with Gemini-3-Flash.
💡 Why this matters MeMo decouples knowledge from reasoning: Train memory once with a small open model, then plug it into the frontier LLM of your choice. No retraining as new corpora arrive, no fragile retrieval pipelines, and full compatibility with proprietary APIs, paving the way for scalable knowledge-aware AI systems.
🤝 Joint work with @workryanq_nus, @961014dltkdg, @alfredleongwl, Alok Prakash, Nancy F. Chen, @arun_v3rma, Daniela Rus, and Armando Solar-Lezama
📄 Paper: arxiv.org/abs/2605.15156 💻 Code: github.com/arunv3rma/MeMo 🌐 Project page: arunv3rma.github.io/blogs/me… 🤗 Huggingface: huggingface.co/collections/G… #LLMs #KnowledgeIntegration #MemoryAugmentedLLMs #RAG #ModelMerging
May 20
// Memory as a Model // The paper augments any LLM with a separate trained memory model that stores, retrieves, and integrates facts on its behalf. It decouples memory updates from base-model weight updates. It achieves continual-learning robustness without catastrophic forgetting, which is a property that RAG fails to deliver. A vector store is a database with a learned encoder bolted on. MeMo is a learned subsystem with explicit interfaces. That distinction matters, as agents need to be able to ingest fresh knowledge weekly without retraining or vector-DB churn. At its core, the position here is that memory in agents should be modular, learned, and gated, not a context-window hack. Paper: arxiv.org/abs/2605.15156 Learn to build effective AI agents in our academy: academy.dair.ai/
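The executive/memory split described in the MeMo posts above can be sketched roughly as follows. This is a guess at the control flow only, not the paper's protocol: the exact three stages, the prompts, and the names `answer_with_memory`, `executive`, and `memory` are all assumptions. Both models are treated as plain text-in, text-out callables, which matches the claim that no weights, gradients, or logits are needed.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt to generated text.
TextModel = Callable[[str], str]


def answer_with_memory(question: str, executive: TextModel, memory: TextModel) -> str:
    # Stage 1 (assumed): the frozen executive splits the question into
    # targeted sub-queries, one per line.
    plan = executive(f"Decompose into sub-questions, one per line:\n{question}")
    sub_queries: List[str] = [ln.strip() for ln in plan.splitlines() if ln.strip()]

    # Stage 2 (assumed): each sub-query goes to the trained memory model,
    # which answers from what it learned about the corpus.
    notes: List[Tuple[str, str]] = [(q, memory(q)) for q in sub_queries]

    # Stage 3 (assumed): the executive reasons over the collected answers.
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in notes)
    return executive(f"Using these notes:\n{context}\nAnswer the question: {question}")


# Usage with stand-in models (stubs, not real LLM calls).
def stub_executive(prompt: str) -> str:
    if prompt.startswith("Decompose"):
        return "Who wrote the book?\nWhen was it published?"
    return prompt  # echo the final prompt so the flow is visible


def stub_memory(query: str) -> str:
    return f"memory answer to: {query}"


result = answer_with_memory("Who wrote the book and when?", stub_executive, stub_memory)
```

Because the executive only ever sees text, swapping in a different frontier model means changing one callable, and the memory model can be retrained or merged independently.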
🎉 Thrilled to share that 3 of our papers were accepted to @aclmeeting in San Diego! 🌊 📋 A Survey on Evaluation of LLM-based Agents ⚖️ Mediocrity is the Key for LLM as a Judge Anchor Selection 🔀 Will it Merge? On The Causes of Model Mergeability A thread 🧵👇 #LLMs #MachineLearning #NLP #AgentEvaluation #LLMJudge #ModelMerging
🤯 Struggling with dataset mixing ratios in LLM continual training? 🧩 We propose OptiMer: train one model per dataset, then merge them optimally. No more costly ratio tuning! 📄OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training 🔗arxiv.org/abs/2603.28858 My last work at @NICT_Publicity also related to our collaboration with @AISingapore 🧵 1/9 #NLProc #NLP #LLM #ModelMerging #大規模言語モデル #AI
Model merging is the most underrated technique in the LLM space right now. Shakti gives you DARE, SLERP, and TIES merging methods in one clean Python toolkit. Create custom model blends without retraining. Stop fine-tuning from scratch when you can merge. github.com/MukundaKatta/shak… #LLM #ModelMerging #ML #OpenSource

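For readers unfamiliar with two of the methods named in the post above, here is a minimal sketch of SLERP and DARE on flat weight lists. It is not the Shakti toolkit's API, which is not shown in the post; the function names `slerp` and `dare_delta` and their parameters are hypothetical. SLERP interpolates along the arc between two weight vectors instead of the straight line, and DARE randomly drops entries of a fine-tuning delta and rescales the survivors so the expected delta is unchanged.

```python
import math
import random
from typing import List


def slerp(a: List[float], b: List[float], t: float, eps: float = 1e-8) -> List[float]:
    """Spherical linear interpolation between two weight vectors, t in [0, 1]."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    linear = [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    if norm_a < eps or norm_b < eps:
        return linear  # angle undefined for a zero vector
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return linear  # (anti)parallel vectors: fall back to plain interpolation
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def dare_delta(delta: List[float], drop_rate: float, rng: random.Random) -> List[float]:
    """Drop each delta entry with probability drop_rate, rescale the rest."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else d * keep_scale for d in delta]


# Usage: halfway between two orthogonal unit vectors, then a sparsified delta.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
sparse = dare_delta([0.2, -0.4, 0.1, 0.3], drop_rate=0.5, rng=random.Random(0))
```

In a real merge these operations would typically run per tensor over each model's parameters, and the sparsified deltas would be added back to the shared base weights.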
The world’s first AI persona marketplace just opened for early access. I created and spawned the very first persona by merging two specialized AI minds. Spawning is free right now (BTC fee is stubbed/no sats required in beta). Two personas → one new one that could go beyond both parents. Ready to build the next generation? spawngeneai.com/register #AI #ModelMerging #SpawnGene

WeldMCP.com – Premium .com for AI model welding & multi-context orchestration 🔥 "Weld" = seamless model merging & fusion "MCP" = Model Control Plane / Multi-Context Processing Perfect brand for: • Model merging & composition platforms • Advanced multi-model AI systems • Hybrid LLM architectures • Agent orchestration & control layers The era of welding multiple models into one powerful system is here. Available for acquisition. Serious buyers only – DM open. #ModelMerging #AIOrchestration #LLM #DomainForSale
Factual knowledge localization and ModelMerging are lines of work that ask where things are stored in an LLM and how to merge them afterwards 🤔 What if it were done a priori? Separating the relation from the data point in the dataset ("The phone number of * is *") would be like LoRA, but brute-force and surgical (LoRA breaks little, and arbitrarily)
🤯 **SeedLoRA** is the efficiency hack! Merge LoRA adapters with Weight Distribution Match to get **full fine-tuning power** at PEFT cost. Say goodbye to the colossal compute bill! What are you merging next? #LLM #PEFT #ModelMerging
Model merging just got WILD! 🤯 We're blending smaller, specialized LLMs into a 'Frankenstein AI' that outperforms the giants, all without costly retraining. Efficiency unlocked! What's the craziest model combo you'd try? #LLMs #ModelMerging #AIEfficiency
LLM merging just went 🚀! Instantaneous, no-loss fusion is here, blowing past old fine-tuning limits. Imagine the specialized AIs! What will you build first? #LLMs #ModelMerging #AIBreakthrough
🤯 New model merging is LLM alchemy! We're skipping colossal fine-tuning by snapping specialized expert models together. Performance spike with zero extra training cost. What are you merging first? #LLMs #ModelMerging #AI
Forget 'bigger is better'! The Branch-Merge distillation breakthrough proves we can port 671B model smarts into a 32B powerhouse with minimal performance loss. Efficiency is the new flex. #LLMs #ModelMerging #AI 🤯
It's live. A trained and hand-tuned Mistral 7B, trained on the SANCTIS architecture. I call it SANCTIFIED Mistral 7B. When combined with SANCTIS as a system prompt, it should show non-trivial uplift in reasoning, coherence, and meta-cognitive processes, in line with a 13-30B class model. My intention is to show how a better architecture can make fuller use of parameters and let a model punch above its perceived weight. Link is in the reply. #opensourceai #modelmerging #ai
Monolithic LLMs are out, **Model Merging** is in! Train small, specialized models, then compose them to build the perfect, custom AI super-brain. Efficiency + accuracy = 🤯. Who needs one giant model when you can build a Voltron? #LLMs #ModelMerging #AI 🧠✨
Tomorrow I’m releasing something unusual. I merged and tuned a 7B model, and during testing, it behaved like a 13B with meta-cognitive uplift in the 30B class. Basically: a lightweight model hitting far above its parameter count. It’s my first real step into LoRA merge engineering, so refinements will continue. Keep an eye out. #ModelMerging #OpenSourceAI #LoRA
Forget huge models! Branch-Merge distillation just took a 671B LLM & turned it into a 32B champ (TinyR1-32B) that crushes SOTA benchmarks. Max performance, minimal compute. Where are you deploying yours? #LLMEfficiency #AI #ModelMerging