New Preprint!
arxiv.org/abs/2606.05346
Question: Are humans doing next-token generation like LLMs?
One important source of behavioral evidence comes from surprisal: when LLMs are used to estimate how unexpected each word is in context, those surprisal values turn out to predict human reading times remarkably well. This convergence is consistent with the idea that humans, like LLMs, are running something like next-token prediction.
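For readers who haven't worked with surprisal: it is the negative log probability a model assigns to the word that actually came next. Here is a minimal sketch of that calculation. The function name `surprisal_bits` and the toy logits are illustrative assumptions, not code from the paper.

```python
import math

def surprisal_bits(logits, token_id):
    """Surprisal in bits, -log2 p(token | context), from next-token logits.

    `logits` is a hypothetical list of unnormalized scores, one per
    vocabulary item; `token_id` indexes the word that actually occurred.
    """
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_prob = logits[token_id] - log_z  # natural-log probability
    return -log_prob / math.log(2)       # convert nats to bits

# Toy example: a 4-word vocabulary where the model favors word 0.
toy_logits = [2.0, 0.0, 0.0, 0.0]
expected_word = surprisal_bits(toy_logits, 0)    # low surprisal
unexpected_word = surprisal_bits(toy_logits, 3)  # higher surprisal
```

Higher values mean the word was less expected in context; these per-word values are what get related to reading times.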
But surprisal only captures the final probability distribution the model outputs. It tells us what the model concludes about the next word, not the rich sequential computation that produced that conclusion. If we want to test whether humans and LLMs share something deeper than convergent output statistics, we need a measure of what the model is doing inside.
In this paper, I introduce trajectory extrapolation error: a measure that captures a model's internal representational geometry as it processes each word. Rather than asking what the model predicts, it asks how the model's internal state is moving, and how much each new word disrupts the trajectory it had established.
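To make the intuition concrete, here is one way such a measure could be sketched. This is an assumption for illustration only: it uses simple linear extrapolation from the two previous hidden states and Euclidean distance, which may differ from the exact definition in the paper. The name `extrapolation_errors` and the toy states are hypothetical.

```python
import math

def extrapolation_errors(states):
    """Per-word error of a linear extrapolation of the hidden-state path.

    `states` is a hypothetical list of hidden-state vectors, one per word.
    ASSUMPTION: the predicted state continues the last step in a straight
    line, predicted_t = h_{t-1} + (h_{t-1} - h_{t-2}); the paper's actual
    extrapolation scheme and distance metric may differ.
    """
    errors = []
    for t in range(2, len(states)):
        prev2, prev1, actual = states[t - 2], states[t - 1], states[t]
        predicted = [2 * b - a for a, b in zip(prev2, prev1)]
        # Distance between where the trajectory was heading and where it went.
        errors.append(math.dist(actual, predicted))
    return errors

# Toy 2-D trajectory: steady motion along x, then a sudden turn.
toy_states = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0]]
errors = extrapolation_errors(toy_states)
```

In this toy case the third state continues the established path, so its error is zero, while the fourth state departs from it and gets a positive error. The first two words have no value because two prior states are needed to define a direction.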
I found that this measure independently predicts human reading times beyond surprisal, across multiple datasets and model architectures.
Why this matters: trajectory extrapolation error gives us a window into the model's actual sequential processing, not just its output. The fact that this internal geometry tracks human reading behavior is much stronger evidence for human/LLM correspondence than surprisal alone could provide.
Forthcoming work shows that trajectory extrapolation error is also a better predictor of brain activity during language processing.