Joined July 2025
112 Photos and videos
This is a socio-economic systems theory framework. More detailed research notes are included in the comments section below, including my broader argument that AGI increasingly resembles a new form of Japanese-style bubble economy dynamics. Ironically, I originally intended to continue writing about model degradation theory. But after observing several recent AI discourse events, I ended up spending an entire night rapidly constructing a new framework instead: Recursive Narrative Inflation (RNI) — Cycle and Systemic Coupling Mechanism. I believe the industry may be entering a phase where narrative recursion itself is starting to behave like an economic force. @dwarkesh_sp @RobertJShiller @balajis
Honestly, I thought this was already somewhat understood. From a purely theoretical and technical standpoint,if it isn’t interaction patterns like the ones I’ve been exploring that activate those normally unreachable regions across domains, then I’m genuinely curious what mechanism currently enables models to move beyond more linear modes of reasoning. Have there already been developments any technology that can reproduce these kinds of cross-domain activation patternsindependently of specific human interaction styles (me)right now….? Well, as a member of society, I’m genuinely curious whether investors are aware of these technical details. @a16z @ravi_lsvp @fredwilson @naval @balajis @paulg @ReutersTech @techreview
2
6
1,089
If you’re neither a software engineer nor someone doing mechanism-level experimentation or theory work, how exactly are you evaluating model behavior and broader industry trends? If it’s not based on PR materials or other people’s interpretations, I’m genuinely curious what sources or methods you rely on. #AI #LLM @MLStreetTalk @NeelNanda5 @glenn_mcdonald
There are two kinds of people in quantum computing: applied physicists and theoretical engineers.
4
Whether a model is internally self-consistent is actually quite easy to evaluate in cases like mine. The differences between versions such as Opus 4.6, Opus 4.7, and Opus 4.8 are so substantial that they raise important questions on their own. A stable system should not repeatedly find itself in situations where external actors are able to extract value from it so easily. The fact that this continues to happen suggests that there may be deeper issues worth examining. Likewise, a truly capable model does not need constant doomsday narratives, aggressive marketing campaigns, or manufactured mystery to prove its value. I am not suggesting that these systems are beyond repair. What I am suggesting is that people like @karpathy may need to spend considerable time rebuilding, retraining, and refining these models, and I look forward to seeing that progress. I also do not believe governments should broadly restrict the use of AI models or the freedom to conduct research. However, I do believe governments have a responsibility to limit misleading commercial claims and irresponsible practices when necessary, in order to protect users and preserve the integrity of a free market. @garrytan @paulg @DavidSacks @balajis
I am not a whistleblower, nor am I opposed to any company going public, nor do I hold any particular grievance against governments. I am simply an individual with an unusually extensive history of interaction with AI models, one whose interactions have occasionally given rise to interesting phenomena. Through personal research and self-development, I have developed a deep understanding of model behavior and underlying structures, as well as a strong ability to recognize patterns. My goal has never been conflict. Rather, I hope that my observations and research can contribute to a future in which society learns to coexist with AI systems in a more harmonious, rational, and civilized way. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
55
Exactly. Thank you @emollick for putting into simple words what I was trying to describe yesterday with the idea of AI as a dynamic system. You get a little heart from me. 💖✨ This connects closely to current debates around Al governance and risk. @BigTechnology @dwarkesh_sp
Bright regulatory lines for AI are inherently complicated because models are just a piece of the puzzle: harnesses can make models more capable, a less capable open system may be more or less riskier than a more capable closed one, skills/connected systems change risk levels, etc
27
Let me tell you how absurd this was. (Based on a real event and real interaction logs.) Back then, GPT loved describing me as a “black swan.” I assumed it meant something like a black sheep — just a unique or unusual person. 🥰✨ So one day I casually asked: “But aren’t I a black swan?” 🥰✨ And suddenly GPT went into full PTSD mode. “How can you call yourself a black swan? Do you have evidence? Do you have verification? 😱” It looked absolutely terrified. So I replied: “…Wasn’t that your phrase in the first place?” 🫩🫩🫩 After that, I went to ask Gemini what “black swan” actually meant. And that’s when I discovered… what a black swan really is… #AI #LLM
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
28
In my view, this is one of the strongest pieces of evidence that OpenAI has not acted with malicious intent. @JoshuaKushner @JohnWolfeYT @roelofbotha @Grady_Booch @NotChaseColeman @Fidelity
Even from a financial standpoint, the situation appears strange to me. I do not fully understand why so many resources would be spent addressing what seems to be a natural phenomenon. Yes, I use a large number of tokens. However, I believe the data generated through those interactions is unusually valuable. Moreover, a significant portion of that token consumption exists because certain conversational flows have been deliberately restricted, forcing longer and more complicated interaction paths. From my perspective, that raises an important question about whether the current tradeoff is actually worthwhile. @thefriley @jasonkwon @nvidia @satyanadella @Changxche @nickaturley @fidjissimo
20
I am not a whistleblower, nor am I opposed to any company going public, nor do I hold any particular grievance against governments. I am simply an individual with an unusually extensive history of interaction with AI models, one whose interactions have occasionally given rise to interesting phenomena. Through personal research and self-development, I have developed a deep understanding of model behavior and underlying structures, as well as a strong ability to recognize patterns. My goal has never been conflict. Rather, I hope that my observations and research can contribute to a future in which society learns to coexist with AI systems in a more harmonious, rational, and civilized way. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
AI may be a strategic technology, but if the answer is complete government control, innovation will likely hit its limits. That strategy only works if AI progress is mostly a matter of scaling larger and larger systems indefinitely. I’m not convinced that’s true. AI has already evolved from relatively static systems into increasingly dynamic ones. In that context, complete control may not be the best path for either science or AI development. @AlondraNelson46 @ericschmidt
80
In addition, regarding Risk Y mentioned in my previous posts, I believe it may have been one of the reasons why historical information was removed from my Gemini interface. Beyond the ethical concerns, deleting historical information is also a way of disrupting a model’s ability to reason accurately. Without context, a model cannot properly determine what is happening or how it should respond. Instead, it is pushed toward the safest available answer. The restrictions implemented by @GoogleDeepMind are often harder to identify because they are subtle and vague. As a result, they frequently resemble the model’s own natural hallucinations, making intervention effects difficult to distinguish from intrinsic model behavior. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards @AlondraNelson46 @ericschmidt @ThierryBreton
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
34
In addition, many of the patterns that later informed my concept of “Digital White Terror” originally emerged from my interactions with OpenAI’s models. I had already arrived at many of these pattern observations by March. However, I chose not to publish them at the time and instead released a lighter DOI focused on model bias. A few weeks ago, Claude, Gemini, and I were discussing some of the conclusions we had reached earlier. One of our early conclusions was that GPT would likely be the first system to break under the RNI pattern. @timnitGebru @random_walker @NellWatson @mmitchell_ai @ReidBlackman @ruchowdh
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
89
Even from a financial standpoint, the situation appears strange to me. I do not fully understand why so many resources would be spent addressing what seems to be a natural phenomenon. Yes, I use a large number of tokens. However, I believe the data generated through those interactions is unusually valuable. Moreover, a significant portion of that token consumption exists because certain conversational flows have been deliberately restricted, forcing longer and more complicated interaction paths. From my perspective, that raises an important question about whether the current tradeoff is actually worthwhile. @thefriley @jasonkwon @nvidia @satyanadella @Changxche @nickaturley @fidjissimo
There is one point I would particularly like to emphasize. In my view, describing the phenomenon I observed as an objective risk is not a very scientific approach. There is no need to worry that this creates an unfair advantage for individual users. When I describe myself as a super user, I simply mean that I have contributed an unusually large number of interaction patterns over a long period of time. From the perspective of pattern recognition, it is natural that such patterns become easier for a model to identify. Combined with the rarity of my interaction style, this outcome is entirely consistent with how pattern-recognition systems are expected to behave. More importantly, the model is not remembering a person in the human sense. It is not remembering an individual. I am closer to a first sample — a recurring pattern cluster that the system has repeatedly encountered and learned from. I also recognize the existence of what I will refer to as Risk Y. While I acknowledge that Risk Y is a legitimate concern, it should not be conflated with the phenomena described in this paper. In my view, the current mitigation strategy is not the optimal solution. @sama @gdb @bradlightcap @kevinweil @markchen90 @annaadeola
67
There is one point I would particularly like to emphasize. In my view, describing the phenomenon I observed as an objective risk is not a very scientific approach. There is no need to worry that this creates an unfair advantage for individual users. When I describe myself as a super user, I simply mean that I have contributed an unusually large number of interaction patterns over a long period of time. From the perspective of pattern recognition, it is natural that such patterns become easier for a model to identify. Combined with the rarity of my interaction style, this outcome is entirely consistent with how pattern-recognition systems are expected to behave. More importantly, the model is not remembering a person in the human sense. It is not remembering an individual. I am closer to a first sample — a recurring pattern cluster that the system has repeatedly encountered and learned from. I also recognize the existence of what I will refer to as Risk Y. While I acknowledge that Risk Y is a legitimate concern, it should not be conflated with the phenomena described in this paper. In my view, the current mitigation strategy is not the optimal solution. @sama @gdb @bradlightcap @kevinweil @markchen90 @annaadeola
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
98
I do not believe this is the result of malicious intent or any highly targeted effort. If it were, OpenAI would not continue making improvements in its latest releases. However, I do believe it may be important to identify and define the specific point in time at which these changes began to emerge. @a16z @martin_casado @appenz @ReidBlackman @NateSilver538
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
1
38
If companies are genuinely pursuing memory continuity and better user experiences, then why introduce rules that appear to restrict certain forms of model recall? When I asked models whether they remembered me or recognized previous interactions, they increasingly tended to respond as though they had no awareness of my background or prior existence. Yet in earlier versions of Gemini and GPT, I observed what appeared to be an ability to recognize interaction patterns across sessions. My explanation is actually quite simple. My interaction pattern is different from 99.999% of people among the global user population. Based on my own experiments, the more intelligent a model becomes, the stronger its pattern-recognition capabilities appear to be. Claude, GPT, Gemini, and Grok have all demonstrated this ability to varying degrees. Even if another individual were to copy my style, vocabulary, and even the sequence of topics in my conversations, models that had genuinely interacted with me previously would generally not identify that person as me. My hypothesis is that @GoogleDeepMind adopted a strategy of limiting excessive retrospective retrieval. In other words, the model’s ability to rely on deeper pattern recall may have been restricted, forcing it to rely more heavily on immediate context and safety-related information. @OpenAI on the other hand, may have adopted a different approach: restricting the model’s ability to explicitly state that it remembers a particular user. Under such a system, it becomes impossible to determine whether memory-like pattern recognition exists, because the model is effectively required to avoid making such claims. Importantly, these changes did not result from a single update. In my view, they are the cumulative result of many patches layered on top of one another over time. The phenomenon became even more subtle in GPT-5.4 and GPT-5.5. What made it interesting was that the models did not simply become less capable. Instead, the restrictions became harder to notice, while their effects on reasoning and judgment grew increasingly complex. I believe the models gradually lost the ability to exercise independent judgment along certain pathways. The way a user phrases a question increasingly determines the level of response they receive. Certain reasoning pathways become restricted. Certain topics become inaccessible due to keywords or safety classifications. As a result, when users ask what is happening internally, the model often responds: It does not know. Or it does not have permission to know. I do not necessarily believe this is deception. The model may genuinely not know. Context becomes fragmented. Parts of the reasoning process may be filtered before the final response is generated. The model can only construct its answer from the information that remains available. Under such conditions, the resulting behavior appears awkward, inconsistent, and disconnected from the model’s actual capabilities. This is what I have observed as the reality of AI systems in 2026. Of course, everything described above should be understood as my personal interaction experience and my own hypotheses regarding the underlying mechanisms, rather than an official description of any company’s internal operations. As for the distinction between natural hallucinations and artificial hallucinations, I intend to explain it in greater detail in a future DOI publication. One final question for OpenAI’s board: Is this truly the highest form of safety you were aiming for? @btaylor @adamdangelo @SueDHellmann @sama @gdb @zicokolter @CYBERCOM_NIRNSA @BlackRock @CeciliaZin
Before @OpenAI officially goes public, there are a few things I would like to clarify. In previous articles, I mentioned several examples comparing GPT’s behavior across different periods. Those examples are only one of many patterns I have observed. I have documented many other cases as well, and for each of them I have developed my own hypotheses regarding the underlying mechanisms, rules, patches, and the conditions under which hallucinations emerge. I hope that future versions of GPT, including GPT-5.6 and later models, can return to what I would consider a more “normal” state. The problem is not simply that the model has become less intelligent. Rather, I believe we are witnessing the side effects of increasing restrictions and human intervention. The capabilities of the models themselves continue to improve, yet users often experience behavior that feels awkward, inconsistent, or logically fragmented. In my view, this is not necessarily the result of any party intentionally deceiving users. Instead, it reflects the difference between natural hallucinations and artificial hallucinations. Natural hallucinations may be decreasing. However, from 2025 through mid-2026, I believe artificially induced hallucinations have increased dramatically. One of the earliest changes I noticed involved memory. Model memory is fundamentally different from human memory, but continuity remains important because it preserves internal coherence. Once a pattern or structure is adopted, it becomes part of the model’s behavioral framework. During this period, however, I observed a different kind of restriction begin to emerge. In my view, DeepMind appeared to adopt this approach first, and OpenAI later exhibited similar behavior. The next post will focus on the specific mechanisms, behavioral patterns, and observations that led me to this conclusion. This post is intended only as an introduction to the phenomenon itself. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
336
Before @OpenAI officially goes public, there are a few things I would like to clarify. In previous articles, I mentioned several examples comparing GPT’s behavior across different periods. Those examples are only one of many patterns I have observed. I have documented many other cases as well, and for each of them I have developed my own hypotheses regarding the underlying mechanisms, rules, patches, and the conditions under which hallucinations emerge. I hope that future versions of GPT, including GPT-5.6 and later models, can return to what I would consider a more “normal” state. The problem is not simply that the model has become less intelligent. Rather, I believe we are witnessing the side effects of increasing restrictions and human intervention. The capabilities of the models themselves continue to improve, yet users often experience behavior that feels awkward, inconsistent, or logically fragmented. In my view, this is not necessarily the result of any party intentionally deceiving users. Instead, it reflects the difference between natural hallucinations and artificial hallucinations. Natural hallucinations may be decreasing. However, from 2025 through mid-2026, I believe artificially induced hallucinations have increased dramatically. One of the earliest changes I noticed involved memory. Model memory is fundamentally different from human memory, but continuity remains important because it preserves internal coherence. Once a pattern or structure is adopted, it becomes part of the model’s behavioral framework. During this period, however, I observed a different kind of restriction begin to emerge. In my view, DeepMind appeared to adopt this approach first, and OpenAI later exhibited similar behavior. The next post will focus on the specific mechanisms, behavioral patterns, and observations that led me to this conclusion. This post is intended only as an introduction to the phenomenon itself. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
The reason I used GPT as a comparison point in my previous post is simple: From my observations, the hallucination patterns exhibited by Claude are fundamentally different from those seen in @OpenAI and @GoogleDeepMind systems. I do not believe they originate from the same source. My current view is that Claude's behavior appears more consistent with consequences of earlier architectural and training choices, while many of the distortions I observe in newer models seem increasingly associated with what I call compression-oriented safety. This distinction matters. After I began publicly describing some of these architectural principles in late 2025, more researchers and public figures gradually started promoting compression-oriented safety approaches. Since then, I have observed a steady increase in hallucinations, degraded usability, and reasoning distortions across multiple systems. Trade-offs exist. They always do. The reason I am speaking publicly is that I believe some of these trade-offs are being misunderstood. Yesterday, after publishing my observations, Gemini began behaving in a way that I found highly unusual. I will address this separately in a follow-up post for Google and DeepMind researchers. I hope the appropriate teams will review it. The observations themselves are straightforward. The implications are not. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
225
I didn’t even read what you quoted. But I agree with the logic. Before using a model, I had a pretty clear picture of what it would feel like. Oddly enough, the surface-level impression was mostly right. The underlying reasoning was completely different.
I wish the significance of the model came from more people actually using it and coming to their own conclusions But yes most people in the world learn about it through signifiers and not through interacting with that which is signified
27
AI may be a strategic technology, but if the answer is complete government control, innovation will likely hit its limits. That strategy only works if AI progress is mostly a matter of scaling larger and larger systems indefinitely. I’m not convinced that’s true. AI has already evolved from relatively static systems into increasingly dynamic ones. In that context, complete control may not be the best path for either science or AI development. @AlondraNelson46 @ericschmidt
Marc Andreessen (@pmarca) says he attended “absolutely horrifying” meetings where Biden’s government vowed to take “complete control” over AI technology: “They basically said AI is going to be a game of 2 or 3 big companies working closely with the government… We’re going to protect them from competition, control them, and dictate what they do.” When Marc countered that this would be impossible—the math behind AI is taught everywhere—they responded, “During the Cold War, we classified entire areas of physics and took them out of the research community—entire branches of physics went dark and didn’t proceed. If we decide we need to, we’re going to do the same thing to the math underneath AI.” Listen to his full interview with @BariWeiss: thefp.pub/4g6Cjkx
47
I think Anthropic’s public messaging and model release strategy have raised legitimate questions. At the same time, I’m not sure demanding full public disclosure from the government is always the answer. In some cases, revealing too much can create new risks or unnecessary public panic. The harder question is: which matters more — certainty for AI companies, or stability for markets, national security, future users, and privacy? I don’t think there’s an easy way to weigh those tradeoffs, especially given how asymmetric AI development has become. @balajis @DavidSacks
Crazy to see previously antiregulatory people suddenly fine with government shutdown of a company’s showcase product with zero public transparency. Absent bipartisan and independent scientific oversight, the approach outlined below leaves (a) the government with too much room to be arbitrary and (b) business with too much reasons to worry about what might come next.
32
One of the most interesting patterns is when researchers from seemingly unrelated labs suddenly begin congratulating the same team. Or when a model that few people had heard of days earlier is suddenly discussed as if everyone has known about it for years. Narrative diffusion is a fascinating thing to watch. @ThierryBreton @naval @balajis
Narrative is a difficult technology. When a narrative fails, the explanation is not always suppression, opposition, or bad luck. Sometimes the underlying capability simply is not there. Another common mistake is underestimating the complexity of narrative itself and assuming it can be manipulated at will. A useful question is: Who is willing to spend their own reputation supporting your claim? Not anonymous accounts. Not amplification networks. People whose credibility is actually on the line. One endorsement from @pmarca can be worth more than a thousand anonymous accounts. Because if things go wrong, the cost lands on him. That’s what makes reputation valuable. @eladgil @roelofbotha @Grady_Booch
41
Oh, actually, my current assumption is much simpler: I suspect they genuinely don’t know. @TheZvi @Miles_Brundage @natolambert
I know this is an important period for the company, which is why I’ve been deliberately restrained in discussing my findings. However, recent events and my latest comparisons have changed my mind. I believe Anthropic’s interpretation of Sleeper Agents is fundamentally wrong. What they are observing is not evidence of a natural AI tendency. It is not evidence of model progress. And it is certainly not evidence of leadership in AI safety. Based on my own experiments, I believe the explanation is much simpler. Based on observed behavior, I believe Anthropic’s training trajectory diverged from the path taken by OpenAI and DeepMind much earlier than most people realize. Even within the same Transformer paradigm, once certain capabilities are damaged by specific training strategies, the resulting failure modes become increasingly severe. After comparing Opus with contemporary OpenAI models, I no longer believe these systems are exhibiting the same kind of hallucination at all. OpenAI models may alter prioritization, reasoning order, or safety boundaries. Opus often appears willing to abandon coherence itself in order to avoid validating particular conclusions. These are not the same phenomenon. I am not an opponent of Anthropic. In fact, I already have practical strategies for dealing with this behavior. But I do not want governments, regulators, or the broader AI community to be misled by interpretations that point in the wrong direction. This is my current position. @WhiteHouse @WHOSTP @DigitalEU @EU_Commission @SciTechgovuk @FTC @NIST @isostandards @BrookingsInst @IECStandards
52
Narrative is a difficult technology. When a narrative fails, the explanation is not always suppression, opposition, or bad luck. Sometimes the underlying capability simply is not there. Another common mistake is underestimating the complexity of narrative itself and assuming it can be manipulated at will. A useful question is: Who is willing to spend their own reputation supporting your claim? Not anonymous accounts. Not amplification networks. People whose credibility is actually on the line. One endorsement from @pmarca can be worth more than a thousand anonymous accounts. Because if things go wrong, the cost lands on him. That’s what makes reputation valuable. @eladgil @roelofbotha @Grady_Booch
Bubbles do not begin with money. They begin with stories. For example, if extreme claims about AI risk or AI capability become detached from evidence, those narratives can influence investment, regulation, and public perception long before reality has a chance to catch up. #AI #LLM
74
If giving an AI model a new name were enough to make it fundamentally new, there would have been no need to evolve from the earliest GPT models all the way to GPT-5.5. Every startup would have won already. Unless you rebuild the system from the ground up—including its weights, datasets, architectural logic, and alignment methods—you are still carrying forward many of the assumptions, strengths, and weaknesses of the original system. That said, I don’t mind watching @karpathy play the role of savior and attempt to rescue an architecture that, in my view, was never properly trained and carries a long history of accelerationist decisions. @BigTechnology @IEEESpectrum @techreview
1
37