Now I’m convinced. Anthropic baited the US into doing this for the publicity.
- The US brought them a concern.
- Despite their entire existence centering around a neurotic obsession about safety, when presented with a simple request from the government, Anthropic totally switched up their ethos and did not even want to acknowledge it.
- The US telegraphed that it would impose controls if Anthropic did not take the concern seriously.
- So Dario—who learned from the DoD situation that indignant opposition to the government which triggers public policy is very lucrative for Anthropic (check the meteoric valuation rise since February)—simply let that happen, knowing that at any point he could just make some nominal changes, reassure the government, and accept a tsunami of new demand, supercharged by the US government rubber stamping his model as “too powerful to exist.”
This whole thing is like a carbon copy of the DoD situation—the government just wanted to feel like Anthropic was on their team, and Dario just vehemently refuses to play ball.
He either has a deeply narcissistic Karen personality disorder where he is totally inflexible and unwilling to offer any reassurance to colleagues in situations when he deems their concerns “beneath him” or “wrong”—so inflexible that he deliberately stays indignant long enough to trigger some negative impact on himself which will allow him to play the victim and bathe in the glory of being the “unjustly maligned party” in the situation…
OR he knows (and had it reinforced last time) how lucrative this kind of public action can be for his company, and deliberately dragged his feet to bait them into doing this again, knowing how easy it would be for him to just switch up after the fact and turn all the models back on for a ravenous global audience.
He either accidentally has the perfect personality defects to yield optimal outcomes for Anthropic, or Dario just played the US government like a fiddle…*again.*
I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)
— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
— In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
— In reaction, the Admin issued the export control. The Admin did this reluctantly. It has been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (i.e., fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.