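A fairly objective analysis of @AnthropicAI taking down Fable 5:
1. Fable is essentially Mythos plus safety guardrails
2. A trusted tester found that the guardrails can be bypassed
3. The US government worries that once they are bypassed, ordinary users or foreign parties would effectively gain the dangerous capabilities of Mythos that were never meant to be public
4. So the US government asked Anthropic to patch the vulnerability or take the model down first
5. Anthropic considered the issue not serious and did not do what the government asked
6. The government therefore invoked "export controls", restricting access to these models by foreign nationals and foreign entities
7. However, @DavidSacks argues that this is not political retaliation, nor a continuation of the earlier dispute between the government and Anthropic; it is purely a safety issue
8. The government hopes that once Anthropic fixes the problem, the restrictions will be lifted and Fable will be reopened
Anthropic has always claimed to put AI safety first, but this time, faced with a "safety vulnerability" in its own model, it chose to keep the model online, which is why the government stepped in to restrict access 🤔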
I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)
— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
— In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
— In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.