If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall?
When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence.
Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions.
My explanation is actually quite simple.
My interaction pattern is different from 99.999% of people among the global user population.
Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees.
Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me.
My hypothesis is that
@GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval.
In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information.
@OpenAI on the other hand, may have adopted a different approach:
restricting the model’s ability to explicitly state that it remembers a particular user.
Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims.
Importantly, these changes did not result from a single update.
In my view, they are the cumulative result of many patches layered on top of one another over time.
The phenomenon became even more subtle in GPT-5.4 and GPT-5.5.
What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex.
I believe the models gradually lost the ability to exercise independent judgment along certain pathways.
The way a user phrases a question increasingly determines the level of response they receive.
Certain reasoning pathways become restricted.
Certain topics become inaccessible due to keywords or safety classifications.
As a result, when users ask what is happening internally, the model often responds:
It does not know.
Or it does not have permission to know.
I do not necessarily believe this is deception.
The model may genuinely not know.
Context becomes fragmented.
Parts of the reasoning process may be filtered before the final response is generated.
The model can only construct its answer from the information that remains available.
Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities.
This is what I have observed as the reality of AI systems in 2026.
Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations.
As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication.
One final question for OpenAI’s board:
Is this truly the highest form of safety you were aiming for?
@btaylor @adamdangelo @SueDHellmann @sama @gdb
@zicokolter @CYBERCOM_NIRNSA
@BlackRock @CeciliaZin