Honestly, I think memory is the biggest blocker to continual learning right now.
Here's what keeps me up at night: How do we use memory to avoid repeating the same mistakes? How do we teach models to selectively remember and forget? When should they surface the right context? Humans do this naturally through consolidation, interference management, and contextual binding, but we haven't figured out how to replicate it.
The gap between human hippocampal systems and current LLM memory architectures reveals a fundamental challenge. We've basically built two extremes: models that bake everything into parameters (rigid) or that retrieve stuff mechanically with RAG (fuzzy). True continual learning requires that we crack the code of intelligent retrieval: not just what to store, but what to suppress, when to reinforce, and how to let old knowledge fade gracefully without catastrophic interference.
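To make "reinforce" and "fade" a bit more concrete, here is a toy sketch of a memory store where each item's strength decays exponentially and gets a boost whenever it is used again. This is my own illustration, not something from the survey: the class names, the `half_life` and `floor` parameters, and the decay rule are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    strength: float = 1.0     # hypothetical salience score
    last_access: float = 0.0  # time of the last store or reinforcement


@dataclass
class DecayingMemory:
    half_life: float = 10.0   # assumed: time units for strength to halve
    floor: float = 0.1        # assumed: below this, an item is suppressed
    items: dict = field(default_factory=dict)

    def _current(self, item, now):
        # Exponential decay since the last access.
        elapsed = now - item.last_access
        return item.strength * 0.5 ** (elapsed / self.half_life)

    def store(self, key, text, now):
        self.items[key] = MemoryItem(text, 1.0, now)

    def reinforce(self, key, now, boost=1.0):
        # Using a memory adds to whatever strength has survived decay.
        item = self.items[key]
        item.strength = self._current(item, now) + boost
        item.last_access = now

    def recall(self, now):
        # Keys still above the floor, strongest first; the rest are suppressed.
        live = [(self._current(it, now), k) for k, it in self.items.items()]
        return [k for s, k in sorted(live, reverse=True) if s >= self.floor]
```

Nothing is ever deleted here: an unused item simply drops below the floor and stops being surfaced, which is one simple way to get fading without a hard overwrite.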
Loved reading this survey because it offers a bird's-eye view of memory architectures in LLMs and multimodal models (also loved the brain-inspired taxonomy, nice touch!). I'll do my best to map it out systematically.
The Three-Part Framework: They structure memory around the neocortex-hippocampus-prefrontal cortex analogy:
Implicit memory / the neocortex covers parametric knowledge baked into model weights, including techniques for memory editing (like ROME and MEMIT that surgically modify weights to update facts), knowledge injection via adapters like LoRA, and memory unlearning for removing harmful content.
Explicit memory / the hippocampus examines external retrieval systems: RAG architectures, vector databases, knowledge graphs. They detail how memory can be organized at different granularities (documents, chunks, sentences, graph structures) and optimized at different stages (training-free, joint pre-training, SFT, etc.).
Agentic memory / the prefrontal cortex explores how autonomous agents maintain short-term memory (CoT) versus long-term memory (external databases of facts, historical trajectories, user feedback, etc.).
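The split between short-term and long-term memory in the last item can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the survey's design: `AgentMemory` and `overlap` are hypothetical names, and word-overlap (Jaccard) similarity stands in for the embedding search a real vector database would do.

```python
from collections import deque


def overlap(a, b):
    """Jaccard similarity over lowercase word sets (stand-in for embeddings)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


class AgentMemory:
    def __init__(self, short_term_size=3):
        self.short_term = deque(maxlen=short_term_size)  # recent turns only
        self.long_term = []                              # persistent records

    def observe(self, turn):
        # When the buffer is full, the oldest turn is about to be evicted,
        # so archive it in long-term memory first.
        if len(self.short_term) == self.short_term.maxlen:
            self.long_term.append(self.short_term[0])
        self.short_term.append(turn)

    def retrieve(self, query, k=1):
        # Rank long-term records by similarity to the query, best first.
        ranked = sorted(self.long_term,
                        key=lambda r: overlap(query, r), reverse=True)
        return ranked[:k]

    def build_context(self, query, k=1):
        # Retrieved long-term records first, then the recent turns in order.
        return self.retrieve(query, k) + list(self.short_term)
```

Usage: call `observe` after each turn, then `build_context(query)` to assemble what the model sees. Archiving every evicted turn is the crudest possible policy; deciding what deserves to be kept is exactly the hard part.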
I love this framework for thinking about memory, but I think the survey's biggest contribution beyond its categorization is identifying open problems: memory contamination and hallucination, the computational burden of large-scale retrieval, when to retrieve vs. rely on parametric knowledge, and the challenge of memory consistency across long interactions. All areas I would love to see more papers in!
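For the "when to retrieve" question, one simple heuristic is to gate retrieval on the model's own confidence in a draft answer. The sketch below is an assumption on my part, not a method from the survey: `should_retrieve` is a hypothetical helper, the threshold is arbitrary, and it expects per-token probabilities that are strictly greater than zero.

```python
import math


def should_retrieve(token_probs, threshold=0.6):
    """Return True when the geometric-mean token probability is below threshold.

    token_probs: probabilities (each > 0) the model assigned to the tokens of
    its draft answer. Low average confidence suggests that parametric
    knowledge is shaky and external retrieval may help.
    """
    if not token_probs:
        return True  # no evidence of confidence, so fall back to retrieval
    mean_logp = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_logp) < threshold
```

A confident draft such as `[0.9, 0.95, 0.9]` skips retrieval, while one with a few very uncertain tokens triggers it. Token probability is only a rough proxy, since a model can be confidently wrong, which is part of why this remains an open problem.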