Excited to present - Human-Like Memory for LLMs:
Our brain interprets its continuous experience by segmenting it into discrete events, which are then stored. What is even cooler is that the boundaries between those events correspond to surprise, i.e., moments when the brain is surprised by the information it just saw. We all remember surprising events, don’t we?
Q: Can we enable LLMs to have a human-like memory structure?
A: You bet! This is what my fantastic team just did in our latest paper!
Q: How did we do it?
A: Here’s a short (very short) summary:
1. Tokens are not handled jointly; we chunk them into blocks via a notion of surprise that is defined with respect to the LLM, as our brains do it. The difference is that we define our notion of surprise via log likelihoods (check Section 3 in the paper).
2. While surprise splits are reasonable, we notice that the utility of elements within an event during memory recall depends on their likelihood of being used by the current query.
3. With this realization, we propose refinements of the original splits by reframing the problem from a graph-theoretic perspective and proposing to maximize graph-clustering metrics (check Section 3.3).
4. This finally gives us our algorithm, which: 1) chunks tokens based on surprise in an initial step, and 2) refines the boundaries by building on ideas from graph theory.
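For the curious, here is a toy sketch of the two steps in plain Python. It is not the paper's implementation, just an illustration under stated assumptions: the threshold rule (mean + gamma * std of recent surprises), the choice of modularity as the graph-clustering metric, the hand-made similarity matrix, and all function names are made up for this example. Check Sections 3 and 3.3 for the real definitions.

```python
from statistics import mean, pstdev

def surprise_boundaries(log_probs, window=4, gamma=1.0):
    """Step 1 (assumed rule): token t starts a new event when its surprise
    (-log-likelihood) exceeds mean + gamma * std of the previous `window` tokens."""
    surprises = [-lp for lp in log_probs]
    boundaries = [0]  # the first token always opens an event
    for t in range(1, len(surprises)):
        recent = surprises[max(0, t - window):t]
        if len(recent) < 2:
            continue  # not enough history to estimate a threshold
        if surprises[t] > mean(recent) + gamma * pstdev(recent):
            boundaries.append(t)
    return boundaries

def event_labels(boundaries, n):
    """Map each of the n tokens to the index of the event it belongs to."""
    labels, event = [], -1
    for t in range(n):
        if event + 1 < len(boundaries) and t == boundaries[event + 1]:
            event += 1
        labels.append(event)
    return labels

def modularity(sim, boundaries):
    """Newman modularity of the partition induced by `boundaries` on the
    weighted graph whose adjacency matrix is `sim` (symmetric, non-negative)."""
    n = len(sim)
    total = sum(sum(row) for row in sim)  # equals 2m in the usual notation
    degree = [sum(row) for row in sim]
    labels = event_labels(boundaries, n)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += sim[i][j] - degree[i] * degree[j] / total
    return q / total

def refine_boundaries(sim, boundaries):
    """Step 2 (assumed procedure): slide each boundary between its neighbours
    and keep the position that maximizes modularity."""
    n = len(sim)
    refined = list(boundaries)
    for i in range(1, len(refined)):
        lo = refined[i - 1] + 1
        hi = refined[i + 1] if i + 1 < len(refined) else n
        refined[i] = max(
            range(lo, hi),
            key=lambda b: modularity(sim, refined[:i] + [b] + refined[i + 1:]),
        )
    return refined

# Toy data (invented): one very surprising token at position 4.
log_probs = [-1.0, -1.1, -0.9, -1.0, -5.0, -1.0, -1.1, -0.9]
# Toy similarity (invented): tokens 0-2 resemble each other, as do tokens 3-7.
group = [0, 0, 0, 1, 1, 1, 1, 1]
sim = [[1.0 if group[i] == group[j] else 0.1 for j in range(8)] for i in range(8)]

initial = surprise_boundaries(log_probs)
refined = refine_boundaries(sim, initial)
print(initial, refined)
```

On this toy input the surprise step places a boundary at token 4, and the refinement step moves it to token 3, where the similarity structure actually changes. In a real setting the similarity matrix would come from the model itself rather than being written by hand.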
Q: Does it work?
A: Marvelously! Compared to InfLLM (the previous SOTA), we show an overall relative improvement of 4.3% across various tasks, including a 33% improvement on the PassageRetrieval task.
Q: Is this human connection just a BS selling point?
A: Nope! Our analysis reveals strong correlations between EM-LLM’s event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart. This work not only advances LLM capabilities in processing extended contexts but also provides a computational framework for exploring human memory mechanisms, opening new avenues for interdisciplinary research in AI and cognitive science.
Check out the paper:
arxiv.org/pdf/2407.09450 #AI #MachineLearning #GPT4 #LLM