Continuing our effort with @KickItLikeShika to reproduce the results of the "Self-Distillation Enables Continual Learning" paper, we have now reached a decent 64% on the second science task from the original paper. To spare everyone our pain, the checkpoint is open-sourced here: huggingface.co/KickItLikeShi…🫡 #AI #SelfDistillation #ContinualLearning
Most teams treat AI agent telemetry the way they treat traditional logs and traces. Your AI Agent Telemetry is More Valuable Than You Think!
For instance, a LangSmith trace from a @LangChain agent is not just a debugging artifact. When you bind it to the final user outcome (accepted, edited, regenerated, or abandoned), it becomes an execution trajectory. These execution trajectories are the raw material for CL/CD: Continuous Learning / Continuous Deployment.
🔗 blog.investperpetual.com/you…
📉 Agent Telemetry is not a depreciating asset
Agent traces capture how the task was solved: the user objective, retrieved context, reasoning path, tool calls, intermediate decisions, output, and final human feedback.
🧪 The reward signal is already inside the product
Every accept, edit, regenerate, correction, abandonment, or manual override is preference data. Teams are already generating RLHF-style signals through normal product usage.
🔁 CL/CD replaces frozen agent behavior
Traditional software ships static code through CI/CD. Agentic systems need CL/CD to continuously learn from production trajectories and ship improved behaviors without waiting for massive retraining cycles.
👨‍🏫 Frontier models become teachers, not crutches
For repeatable workflows, frontier models do not need to be the permanent execution engine. They can generate high-quality trajectories that smaller specialized models and LoRA adapters internalize over time.
🧱 This is where companies like @trajectorylabs and @primeintellect become interesting
These platforms turn agent traces into learnable execution trajectories and simplify open-model training and infrastructure. Together, they hint at a future in which teams own the learning loop rather than renting intelligence forever.
🚀 The defensibility layer is the dataset
The moat is not just prompts, tools, or access to frontier APIs. It is proprietary, domain-specific execution trajectories linked to verified human outcomes.
Stop treating agent telemetry as logs. Start treating it as training infrastructure. #AIAgents #AgenticAI #LLMOps #ContinualLearning #AIEvals #OpenSourceAI #AIInfrastructure
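One way to picture "binding a trace to the final user outcome" is the rough sketch below. It groups attempts at the same task and turns accept/edit/regenerate/abandon signals into (chosen, rejected) preference pairs. Everything in it is an assumption made for illustration: the `Trajectory` record, its field names, the outcome labels, and the pairing rule are hypothetical and are not taken from LangSmith, LangChain, or the linked blog post.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

# Hypothetical outcome labels, mirroring the signals listed in the post.
POSITIVE = {"accepted"}
NEGATIVE = {"regenerated", "abandoned"}


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record: one agent attempt plus what the user did with it."""
    task_id: str                         # groups attempts at the same user objective
    output: str                          # the agent's final output for this attempt
    outcome: str                         # "accepted", "edited", "regenerated", "abandoned"
    edited_output: Optional[str] = None  # the user's corrected text when outcome == "edited"


def preference_pairs(trajectories):
    """Return (task_id, chosen, rejected) tuples derived from user outcomes."""
    by_task = {}
    for t in trajectories:
        by_task.setdefault(t.task_id, []).append(t)

    pairs = []
    for task_id, attempts in by_task.items():
        chosen, rejected = [], []
        for t in attempts:
            if t.outcome in POSITIVE:
                chosen.append(t.output)
            elif t.outcome in NEGATIVE:
                rejected.append(t.output)
            elif t.outcome == "edited" and t.edited_output:
                # Assumption: an edit means the user preferred their version.
                chosen.append(t.edited_output)
                rejected.append(t.output)
        # Pair every preferred output with every dispreferred one for the same task.
        pairs.extend((task_id, c, r) for c, r in product(chosen, rejected) if c != r)
    return pairs
```

A task with only an accepted attempt produces no pair, because there is nothing to contrast it with. In practice the pairing rule, and whether "edited" counts as a weak or strong signal, would depend on the product.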
📄 Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories ✨ Introduces a “Sleep” paradigm for LLMs, combining memory consolidation with self-generated “dreaming” to support continual learning and reduce forgetting. 🔗 Paper: arxiv.org/abs/2606.03979 🖼️ Infographic comparison: ① Gemini-generated ② ChatGPT-generated #AIResearch #LLMs #ContinualLearning
Memory is not the moat. Behavior change is. Most "AI memory" products stop at retrieval. They store conversations, embed them, surface the relevant chunks. The agent remembers what you said. It still makes the same mistake tomorrow. That is not learning. That is a filing cabinet with semantic search.
At @MidbrainAI, we are building the layer that turns experience into persistent behavior change. Six steps, one loop:
1. Experience - conversations, actions, sensors, feedback
2. AI Agent - perceives and acts
3. SmartSearch - deterministic recall, NER multi-hop ColBERT reranking
4. Persistent Experience Layer - episodic, semantic, and procedural memory unified
5. Learning Engine - consolidation and online learning that updates the model in real time
6. Behavioral Adaptation - faster, personalized, proactive, trusted
The continual learning loop closes the gap that every current memory product punts on: experience → recall → learn → adapt → improve.
One companion across every embodiment. Phone, laptop, robot, car, smart home, XR, games. Same identity, same understanding, everywhere. Personal Brain for individuals. Company Brain for institutional knowledge. All of it client-owned. End-to-end encrypted. Zero knowledge server. Your data, your keys.
We are not building a better filing cabinet. We are building the brain that fills it. midbrain.ai #ContinualLearning #AIAgents #Memory #LLM
May 22
17 years in tech. Not a single day where I felt like I had “figured it all out.” And honestly? That’s the best part.
Yesterday I received the Gold Guru recognition at @TCS for talent development, and it made me pause and reflect, not on the award itself, but on the chain of people who shaped me along the way. Every architect I know started as someone who didn’t know what an API was. Every mentor was once the one asking the “dumb” questions. Every leader was once the nervous fresher on day one.
The cycle only works if you keep two things alive:
→ The curiosity to keep learning even when you’re “senior enough” to stop.
→ The willingness to pass it forward, especially when nobody’s watching.
A few things I’ve learned in 17 years that I wish someone had told me in Year 1:
1. Your tech stack will change. Your learning habit shouldn’t. I went from building Java web and CMS platforms to architecting AI solutions. The only constant was staying uncomfortable.
2. Teaching isn’t a detour. It’s a multiplier. The moment you explain something to someone else, you understand it three levels deeper yourself.
3. Build in public. Share what you learn. Open source your projects, write about your failures, and let the community hold you accountable. It compounds.
4. The journey is the destination. Titles change. Projects end. But the people you lifted along the way, that’s the real résumé.
To every early-career professional reading this: You don’t need to have all the answers. You just need to keep showing up, keep asking, and keep building. The tech industry moves fast. But the ones who last aren’t the fastest, they’re the ones who never stopped being students.
Grateful for the incredible culture at TCS that celebrates learning and mentorship at this scale. And even more grateful for every person who once took a chance on guiding me. Here’s to staying curious. 🙌 #ContinualLearning #TalentDevelopment #Mentorship #TCS #OpenSource #TechLeadership #BuildInPublic #ArchitectLife
How do we make LLMs learn continuously from interactions with users without drowning in noise? Introducing UNO (User log-driveN Optimization), a new framework to continually improve LLM systems using user logs! 🚀
📝 Paper: arxiv.org/abs/2602.06470
💻 Code: github.com/bebr2/UNO
🔍 The Problem: Model scaling has limits. Real-world user logs offer a goldmine of human feedback, but they are unstructured and noisy. Vanilla LLMs struggle with the "Signal-or-Noise Dilemma" and off-policy optimization risks when trying to learn from them.
💡 The Solution: UNO is a unified framework that learns from logs without modifying the base model's weights:
1️⃣ Distills messy logs into semi-structured rules & preference pairs.
2️⃣ Clusters them to manage data heterogeneity.
3️⃣ Quantifies the "cognitive gap" between the model's prior knowledge and the log data to adaptively filter out noise.
4️⃣ Constructs specific modules (adapters) for "primary" (direct generation) and "reflective" (critique/refinement) experiences.
🏆 The Results: SOTA performance on #MemoryBench and #WildFB.
Joint work with Changyue Wang, Weihang Su, and Yiqun Liu. #LLMs #MachineLearning #AI #ContinualLearning #NLP #Research
📙 Learning, Fast and Slow: LLM Fine-Tuning and Plastic Continual Learning with GEPA. A new paper by @LakshyAAAgrawal, @matei_zaharia, and an amazing team of researchers explores how combining slow parameter updates with fast-evolving prompts (via GEPA) enables more efficient fine-tuning and truly plastic continual learning. 📃 Read Paper: arxiv.org/abs/2605.12484 ⚡️ I published a practical breakdown earlier. Read it here: super-agentic.ai/resources/s… #LLM #FineTuning #GEPA #ContinualLearning
🤔 I went to ICLR with a question I had for months: if I were designing a continual learning system today, would I put new knowledge in the weights or in the context? Almost everyone I asked answered "context." That's a dismissive answer! I have spent years working on in-weight methods, and I do not think gradient-based consolidation is dead, just badly matched to what practitioners in industry actually want from continual learning, which is high-fidelity recall of past interactions. Fortunately, a position paper from a 24-author Dagstuhl group landed in my feed and argued, more carefully than I had been managing on my own, that the right answer is neither. In-context learning is for fast adaptation and lossless recall. In-weight learning is for slow consolidation of skill. The real research problem is the modular memory between them, deciding what gets promoted from context into the weights. Hopefully the community will now ask less about "ICL or IWL" and more about "what is the right promotion policy, and on what evidence." 📄 Modular Memory is the Key to Continual Learning Agents #ContinualLearning #ICLR2026 #MachineLearning #FoundationModels
Paper Acceptance Announcement 🎉 The paper titled "Transitioning Heads Conundrum: The Hidden Bottleneck in Long-Tailed Class-Incremental Learning" has been accepted at TMLR 2026 (Transactions on Machine Learning Research). Authors: Rahul Vigneswaran, Hari Chandana Kuchibhotla, Vineeth N Balasubramanian 👏 Congratulations to all the authors! 🔍 Key Highlight: This work introduces DEREK (DEcoupling Representations for Early Knowledge Distillation), a method addressing a previously overlooked challenge in Long-Tailed Class-Incremental Learning (LTCIL): the Transitioning Heads Conundrum. In LTCIL, head classes that are well-represented in earlier tasks become tail classes in subsequent tasks due to memory constraints, leading to accelerated catastrophic forgetting. DEREK mitigates this by decoupling head and tail learning via specialized expert networks and applying Early Knowledge Distillation before data constraints take effect, preserving rich representations. Across 2 LTCIL benchmarks, 12 experimental settings, and 24 baselines, DEREK consistently establishes new state-of-the-art performance. #MachineLearning #ContinualLearning #LongTailedLearning #KnowledgeDistillation #TMLR2026 #IITHyderabad
The shift from "external scaffolding" to "internalized intelligence" is becoming the next frontier for AI agents. This informative deep dive from @MaikaThoughts and @BornsteinMatt explores why in-context learning (while powerful) might hit a ceiling. True discovery and knowledge require models that can compress experience directly into their parameters. Moving beyond the "sticky notes" phase of LLMs toward systems that actually learn from deployment is a critical architectural evolution. Read more: a16z.com/why-we-need-continu… #AI #MachineLearning #ContinualLearning #LLMs #SoftwareArchitecture #AgenticWorkflows
📢 Call for papers: Continual RL Workshop @ RLC 2026, Montreal 🗓️ Submission deadline: May 22, 2026 (AoE) 🔗 Website & CFP: sites.google.com/view/contin… #ReinforcementLearning #ContinualLearning #MachineLearning #RLC2026 #ContinualRL
🎉 Excited to share that our paper "MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems" has been accepted as a ✨SPOTLIGHT✨ (top 2.2%) paper at #ICML2026! MemoryBench is the first benchmark to test whether an LLM system (LLMsys) can continually improve itself from user feedback at service time. It covers multiple domains, languages, and task types to evaluate the #ContinualLearning abilities of LLMsys, with a particular focus not just on #DeclarativeMemory (e.g., facts in long context) but also on #ProceduralMemory (e.g., experience learned from task practice). All the code and data are open-sourced. You can easily try or implement your own methods on MemoryBench. We also built a frontend interface so that you can run experiments easily even without a GPU! Feel free to try! 📄 arXiv: arxiv.org/abs/2510.17281 💻 Code: github.com/LittleDinoC/Memor… 📊 Dataset (Small): huggingface.co/datasets/THUI… 🗄️ Dataset (Full): huggingface.co/datasets/THUI…
Replying to @I_Am_The_ICT
yes agree, i always have trouble with my stop loss placement. fighting the urge to move too quickly as well. strangling my trade as you would say. #continuallearning
In the era of continued pretraining and continued fine-tuning, loss of plasticity means leaving future gains on the table. We need a better theoretical understanding of loss of plasticity. See a great thread unpacking the dynamics. 👇 #ICLR2026 #ContinualLearning #DeepLearning
Neural nets don’t just forget. Sometimes, after long training, they lose the ability to learn at all. In our #ICLR2026 poster, we model Loss of Plasticity as gradient dynamics trapped in invariant manifolds: 🔴 frozen units, 🔵 cloned units. The video makes the traps visible.
Giulia Lanzillotta is presenting this today at #ICLR2026: 📍 Poster Session 6, Pavilion 4, Board #4202 🕒 Sat Apr 25, 3:15–5:45 PM local time Paper/code/demo in replies. #ContinualLearning #DeepLearning #LearningTheory