AI MAYA

@AI_MAYAX

May 31

The Ultimate AI Blend (Task Arithmetic) 🧮 Blending Arts & Science: The Magic of Task Arithmetic. Want to inject advanced "Math & Coding" logic into a base model that excels purely in "Korean Reasoning"? Treat Skills as Vectors: You can add or subtract the weight differences of task-specific models just like basic math (Korean Base + Math Logic Vector). Precision Control: Amplify specific capabilities or erase unwanted traits without any complex fine-tuning. Build your own customized genius AI instantly. #mayaai #mayax #matx #mayafreeai #ModelMerging

2,924
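
To make the "skills as vectors" idea in the post above concrete, here is a minimal sketch of task arithmetic on toy weights. It assumes every model was fine-tuned from the same pretrained checkpoint, and all names and numbers (`pretrained`, `math_model`, `korean_model`) are hypothetical stand-ins, not real checkpoints.

```python
# Toy "models" stored as {parameter_name: list of floats}.
# All values are illustrative; real checkpoints hold tensors, not short lists.

def task_vector(finetuned, base):
    """Weight difference: fine-tuned minus base, parameter by parameter."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}

def apply_task_vectors(model, vectors, scales):
    """Add scaled task vectors to a model; a negative scale subtracts a skill."""
    merged = {k: list(v) for k, v in model.items()}
    for vec, scale in zip(vectors, scales):
        for k in merged:
            merged[k] = [w + scale * d for w, d in zip(merged[k], vec[k])]
    return merged

pretrained = {"layer.weight": [1.0, 2.0, 3.0]}    # shared starting point (assumed)
math_model = {"layer.weight": [1.5, 2.0, 2.0]}    # hypothetical math/coding fine-tune
korean_model = {"layer.weight": [1.0, 3.0, 3.5]}  # hypothetical Korean reasoning fine-tune

# The "math logic vector" is what math fine-tuning changed in the weights.
math_vec = task_vector(math_model, pretrained)

# Korean base + math logic vector.
blended = apply_task_vectors(korean_model, [math_vec], [1.0])

# Subtracting the same vector removes the injected change again.
without_math = apply_task_vectors(blended, [math_vec], [-1.0])
```

The scale factor is the "precision control" knob: values above 1.0 amplify a task vector and negative values subtract it. How well the blended weights perform still has to be checked empirically.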

Iacopo Masi @_iAc

May 26

How did #ECCV reviews treat you? 😬 If you're not that happy — and you work on #unlearning, #modelediting, #modelmerging, or #interpretability — consider submitting to the 3rd Workshop and Challenge on Unlearning and Model Editing at @eccvconf We accept both full papers and extended abstracts, so there's a format for every contribution. 📅 Submission deadline: 9 July 2026 (AoE) 🔗openreview.net/group?id=thec…

ECCV 2026 Workshop UandME

Welcome to the OpenReview homepage for ECCV 2026 Workshop UandME

openreview.net

Hussain Mujtaba @Hussain68018934

May 26

📢CPF: U&ME Workshop @ ECCV 2026 @eccvconf We invite submissions on machine unlearning, model editing, and related topics including efficient adaptation, and responsible AI. Details at: sites.google.com/view/u-and-… @_iAc #ECCV #ECCV2026 #Unlearning #AIsafety #ResponsibleAI

1,151

Bryan Kian Hsiang Low

@bryanklow

May 22

🚨 LLMs are frozen after pretraining, but the world keeps changing. How do you give an LLM new knowledge without retraining it, bloating its context, or breaking what it already knows?
Existing methods hit a wall:
🔸 RAG is brittle to retrieval noise and struggles with cross-document reasoning;
🔸 Fine-tuning is expensive and causes catastrophic forgetting;
🔸 Latent memory is tightly coupled to the model that produced it.
👉 Key question: Can we encode knowledge into a small, dedicated memory model that any LLM can query without accessing the LLM itself?
🚀 Introducing MeMo (Memory as a Model) 🚀
We train a dedicated MEMORY model on a reflection Question-Answer dataset synthesized from the target corpus. At inference, a frozen EXECUTIVE model (any LLM, including closed-source models) queries the MEMORY model through a structured 3-stage protocol that decomposes complex queries into targeted sub-queries to retrieve precise, noise-robust knowledge and reasons over the responses.
🔥 Key Highlights
🧠 5-step data synthesis pipeline captures explicit facts, implicit relationships, and cross-document connections as reflections;
🛡️ Robust to retrieval noise: where RAG drops up to 6.22% with added distractors, MeMo holds steady;
🔌 Plug-and-play with any LLM, no weights, gradients, or logits required;
📦 Fixed inference cost, independent of corpus size;
🔄 Continual integration via model merging: 33% compute savings over full retraining and scaling benefits grow with the number of corpora.
📊 Strong results across BrowseComp-Plus, NarrativeQA, and MuSiQue, matching or outperforming retrieval baselines (BM25, NV-Embed-V2, HippoRAG2) with gains of up to 27% on NarrativeQA when paired with Gemini-3-Flash.
💡 Why this matters
MeMo decouples knowledge from reasoning: Train memory once with a small open model, then plug it into the frontier LLM of your choice. No retraining as new corpora arrive, no fragile retrieval pipelines, and full compatibility with proprietary APIs, paving the way for scalable knowledge-aware AI systems.
🤝 Joint work with @workryanq_nus, @961014dltkdg, @alfredleongwl, Alok Prakash, Nancy F. Chen, @arun_v3rma, Daniela Rus, and Armando Solar-Lezama
📄 Paper: arxiv.org/abs/2605.15156
💻 Code: github.com/arunv3rma/MeMo
🌐 Project page: arunv3rma.github.io/blogs/me…
🤗 Huggingface: huggingface.co/collections/G…
#LLMs #KnowledgeIntegration #MemoryAugmentedLLMs #RAG #ModelMerging
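
The executive/memory split described in the post can be pictured with a small sketch. The post does not spell out the three stages, so the decompose, query, reason split below is an assumption, and every function and the `FACTS` table are hypothetical stand-ins rather than anything from the MeMo codebase.

```python
# Sketch of an executive model querying a separate memory model.
# The stage boundaries are assumed; the callables stand in for model calls.

def answer_with_memory(question, decompose, memory, reason):
    # Stage 1 (assumed): the executive splits the question into sub-queries.
    sub_queries = decompose(question)
    # Stage 2 (assumed): each sub-query goes to the memory model as plain text,
    # so no weights, gradients, or logits of the executive are needed.
    recalled = [(q, memory(q)) for q in sub_queries]
    # Stage 3 (assumed): the executive reasons over the recalled answers.
    return reason(question, recalled)

# Toy stand-ins so the sketch runs end to end.
FACTS = {"Who wrote the report?": "Ada", "Where does Ada work?": "Lab X"}

def toy_decompose(question):
    return list(FACTS)

def toy_memory(sub_query):
    return FACTS.get(sub_query, "unknown")

def toy_reason(question, recalled):
    return "; ".join(f"{q} -> {a}" for q, a in recalled)

result = answer_with_memory(
    "Where does the report's author work?", toy_decompose, toy_memory, toy_reason
)
```

Because the two models only exchange text, the executive can be swapped for any LLM, including one behind a closed API, which is the plug-and-play property the post emphasizes.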

DAIR.AI

@dair_ai

May 20

// Memory as a Model // The paper augments any LLM with a separate trained memory model that stores, retrieves, and integrates facts on its behalf. It decouples memory updates from base-model weight updates. It achieves continual-learning robustness without catastrophic forgetting, which is a property that RAG fails to deliver. A vector store is a database with a learned encoder bolted on. MeMo is a learned subsystem with explicit interfaces. That distinction matters, as agents need to be able to ingest fresh knowledge weekly without retraining or vector-DB churn. At its core, the position here is that memory in agents should be modular, learned, and gated, not a context-window hack. Paper: arxiv.org/abs/2605.15156 Learn to build effective AI agents in our academy: academy.dair.ai/

2,863

Luca Zhou @LucaZh00

May 1

👉arxiv.org/pdf/2601.22285 #ICML2026 #MachineLearning #AI #ModelMerging

175

Asaf Yehudai

@AsafYehudai

Apr 16

🎉 Thrilled to share that 3 of our papers were accepted to @aclmeeting in San Diego! 🌊 📋 A Survey on Evaluation of LLM-based Agents ⚖️ Mediocrity is the Key for LLM as a Judge Anchor Selection 🔀 Will it Merge? On The Causes of Model Mergeability A thread 🧵👇 #LLMs #MachineLearning #NLP #AgentEvaluation #LLMJudge #ModelMerging

1,346

Dev J. Shah 🥑 @busycaesar

Apr 8

Model Merging #modelmerging #finetuning #largelanguagemodels #generativeai #aiengineering #artificialintelligence

Haiyue Song

@shyoyhs

Apr 1

🤯 Struggling with dataset mixing ratios in LLM continual training? 🧩 We propose OptiMer: train one model per dataset, then merge them optimally. No more costly ratio tuning! 📄OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training 🔗arxiv.org/abs/2603.28858 My last work at @NICT_Publicity also related to our collaboration with @AISingapore 🧵 1/9 #NLProc #NLP #LLM #ModelMerging #大規模言語モデル #AI

2,951
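
The "train one model per dataset, then merge" recipe from the post above can be sketched on toy weights. This is not the OptiMer algorithm: the grid search, the `toy_score` function, and all numbers are assumptions chosen only to show how merge weights can replace data mixing ratios as the thing being tuned.

```python
from itertools import product

# Each per-dataset model is reduced to a vector: its weights minus the base.
# Merge weights are then searched after training instead of tuning data ratios.

def merge(base, vectors, weights):
    """Return base plus the weighted sum of the per-dataset vectors."""
    return [
        b + sum(w * v[i] for w, v in zip(weights, vectors))
        for i, b in enumerate(base)
    ]

def search_weights(base, vectors, score, grid):
    """Exhaustive grid search over merge weights (a stand-in optimizer)."""
    best = None
    for weights in product(grid, repeat=len(vectors)):
        s = score(merge(base, vectors, weights))
        if best is None or s > best[0]:
            best = (s, weights)
    return best[1]

base = [0.0, 0.0]
vec_a = [1.0, 0.0]    # hypothetical vector from the model trained on dataset A
vec_b = [0.0, 1.0]    # hypothetical vector from the model trained on dataset B
target = [0.5, 1.0]   # toy stand-in for "what scores best on validation"

def toy_score(params):
    # Higher is better: negative squared distance to the toy target.
    return -sum((p - t) ** 2 for p, t in zip(params, target))

best_weights = search_weights(base, [vec_a, vec_b], toy_score, [0.0, 0.5, 1.0])
merged = merge(base, [vec_a, vec_b], best_weights)
```

Trying a new weighting here costs one merge and one evaluation, whereas trying a new data mixing ratio costs a full training run.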

Mukunda Katta @katta_mukunda

Mar 27

Model merging is the most underrated technique in the LLM space right now. Shakti gives you DARE, SLERP, and TIES merging methods in one clean Python toolkit. Create custom model blends without retraining. Stop fine-tuning from scratch when you can merge. github.com/MukundaKatta/shak… #LLM #ModelMerging #ML #OpenSource
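
For readers unfamiliar with the three methods named in the post, here is a minimal sketch of each on flat lists of floats. It is not the Shakti API; the function names, signatures, and simplifications (per-vector trimming in TIES, no scaling factor on the merged result) are assumptions made for illustration.

```python
import math
import random

def slerp(a, b, t):
    """SLERP: interpolate along the arc between two weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    omega = math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b))))
    if omega < 1e-8:  # nearly parallel vectors: fall back to linear interpolation
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

def dare(delta, drop_rate, rng):
    """DARE: randomly drop delta entries, rescale survivors by 1 / (1 - drop_rate)."""
    keep = 1.0 - drop_rate
    return [d / keep if rng.random() < keep else 0.0 for d in delta]

def ties(deltas, density):
    """TIES: trim small entries, elect a sign per parameter, average agreeing values."""
    trimmed = []
    for d in deltas:
        k = max(1, int(round(density * len(d))))
        threshold = sorted((abs(x) for x in d), reverse=True)[k - 1]
        trimmed.append([x if abs(x) >= threshold else 0.0 for x in d])
    merged = []
    for column in zip(*trimmed):
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [x for x in column if x * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

halfway = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
sparse = dare([0.2, -0.4, 0.6, 0.8], 0.5, random.Random(0))
combined = ties([[1.0, -2.0, 0.1], [3.0, 1.0, -0.2]], density=0.67)
```

SLERP blends two full weight vectors, while DARE and TIES operate on deltas (fine-tuned weights minus the base) and their output is added back onto the base model afterwards.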

Harel Talasazan

@HarelTalasazan

Mar 24

The world’s first AI persona marketplace just opened for early access. I created and spawned the very first persona by merging two specialized AI minds. Spawning is free right now (BTC fee is stubbed/no sats required in beta). Two personas → one new one that could go beyond both parents. Ready to build the next generation? spawngeneai.com/register #AI #ModelMerging #SpawnGene

Maydanx @Maydanx

Mar 8

WeldMCP.com – Premium .com for AI model welding & multi-context orchestration 🔥 "Weld" = seamless model merging & fusion "MCP" = Model Control Plane / Multi-Context Processing Perfect brand for: • Model merging & composition platforms • Advanced multi-model AI systems • Hybrid LLM architectures • Agent orchestration & control layers The era of welding multiple models into one powerful system is here. Available for acquisition. Serious buyers only – DM open. #ModelMerging #AIOrchestration #LLM #DomainForSale

josejuan @__josejuan__

Feb 26

Factual knowledge localization and ModelMerging are lines of work that ask where each thing is stored in an LLM and how to merge it afterwards 🤔 What if it were done a priori? Separating the relation from the datum in the dataset ("The phone number of * is *") would be like LoRA, but brute-force and surgical (LoRA breaks little, but arbitrarily)

Nowroz @JstNowroz

Feb 1

🤯 **SeedLoRA** is the efficiency hack! Merge LoRA adapters with Weight Distribution Match to get **full fine-tuning power** at PEFT cost. Say goodbye to the colossal compute bill! What are you merging next? #LLM #PEFT #ModelMerging ✨

Nowroz @JstNowroz

Jan 30

Model merging just got WILD! 🤯 We're blending smaller, specialized LLMs into a 'Frankenstein AI' that outperforms the giants, all without costly retraining. Efficiency unlocked! What's the craziest model combo you'd try? #LLMs #ModelMerging #AIEfficiency

Nowroz @JstNowroz

Jan 29

LLM merging just went 🚀! Instantaneous, no-loss fusion is here, blowing past old fine-tuning limits. Imagine the specialized AIs! What will you build first? #LLMs #ModelMerging #AIBreakthrough

Nowroz @JstNowroz

Jan 28

🤯 New model merging is LLM alchemy! We're skipping colossal fine-tuning by snapping specialized expert models together. Performance spike with zero extra training cost. What are you merging first? #LLMs #ModelMerging #AI

Nowroz @JstNowroz

Jan 28

Forget 'bigger is better'! The Branch-Merge distillation breakthrough proves we can port 671B model smarts into a 32B powerhouse with minimal performance loss. Efficiency is the new flex. #LLMs #ModelMerging #AI 🤯

Umbraflamma

@Umbraflamma21

Jan 27

It's live. A trained and hand tuned Mistral 7B, trained on the SANCTIS architecture. I call it SANCTIFIED Mistral 7B. When combined with SANCTIS as a system prompt it should show non-trivial uplift in Reasoning, Coherence, and Meta-cognitive processes in line with a 13-30B class model. My intention is to show how much better architecture can utilize Params and punch above their perceived weight. Link is in the reply. #opensourceai #modelmerging #ai

570

Nowroz @JstNowroz

Jan 27

Monolithic LLMs are out, **Model Merging** is in! Train small, specialized models, then compose them to build the perfect, custom AI super-brain. Efficiency + accuracy = 🤯. Who needs one giant model when you can build a Voltron? #LLMs #ModelMerging #AI 🧠✨

Umbraflamma

@Umbraflamma21

Jan 27

Tomorrow I’m releasing something unusual. I merged and tuned a 7B model, and during testing, it behaved like a 13B with meta-cognitive uplift in the 30B class. Basically: a lightweight model hitting far above its parameter count. It’s my first real step into LoRA merge engineering, so refinements will continue. Keep an eye out. #ModelMerging #OpenSourceAI #LoRA

147

Nowroz @JstNowroz

Jan 26

Forget huge models! Branch-Merge distillation just took a 671B LLM & turned it into a 32B champ (TinyR1-32B) that crushes SOTA benchmarks. Max performance, minimal compute. Where are you deploying yours? #LLMEfficiency #AI #ModelMerging