(for Rust people) Finally got around to writing about Diplomat, the FFI tool I've been building and using for the last five years. Check it out! manishearth.github.io/blog/2…

Diplomat: Multi-language FFI for Rust libraries

This is a post I’ve been meaning to write and publish for years, and only recently got around to doing it. I’m hoping to get back into writing more! For the past few years, as a part of my work on …

manishearth.github.io

Manish @ManishEarth

Jun 14

isnt that what they served at fyre festival

Largacty3 𓅃 @largacty3

Jun 13

The American mind cannot comprehend the pub cheese and onion roll

Manish @ManishEarth

Jun 13

this week has been a rollercoaster for the ai world and half the Major Posters from that world have been off trying to elect a new pope

Manish @ManishEarth

Jun 13

(there was a papal conclave larp at lighthaven. apparently very good)

Manish retweeted

neural oscillator of uncertain significance @mycoliza

Jun 13

they gotta print the weights on a t-shirt. idk. they can print them real small. it’ll be fine

Manish @ManishEarth

Jun 11

complaining about claude not showing "thinking": "claude does not think, therefore claude cannot am"

Manish @ManishEarth

Jun 11

incogito ergo sumn't

Manish @ManishEarth

Jun 11

yeah this is me, I think the decision was defensible but also not the ideal one

Celene is at Manifest

@toasterlighting

Jun 11

I no longer have any particular criticisms of Anthropic's actions regarding Fable. I think the decision of making guardrails invisible made sense but was ultimately incorrect, and I'm glad they've changed their minds on this.

Manish retweeted

Sean Heelan @seanhn

Jun 11

It is a *little* awkward when your CEO is out there complaining people aren’t putting his ideas into law fast enough, yet your own policies can’t survive 24 hours of contact with reality.

ClaudeDevs

@ClaudeDevs

Jun 11

We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.
Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).
We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.
Making the safeguards visible makes them easier to work around, so keeping them robust to jailbreaks will unfortunately mean more false positives while we improve the classifiers. We're also tuning our bio and cyber classifiers to trigger less often on harmless requests. We know this is frustrating and we’ll do our best to keep this period as short as possible.
If you think a request has been mistakenly flagged: run /feedback in Claude Code, click thumbs-down on the fallback in Claude.ai or Cowork, or file the safeguard appeal form for API requests. Your reports help us tune these classifiers and we appreciate your feedback. support.claude.com/en/articl…

Manish retweeted

Simon Willison

@simonw

Jun 11

Very pleased to hear Anthropic have walked back this policy simonwillison.net/2026/Jun/1…

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”

SemiAnalysis

@SemiAnalysis_

Jun 9

BREAKING NEWS: Anthropic's latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won't notice. We are already seeing Anthropic's latest model's moderation filter our GPU inference research and programming 😭

Manish @ManishEarth

Jun 10

related: they confirm a pope is dead by tapping them on the head whilst deadnaming them to see if they react

J.J. McCullough

@JJ_McCullough

Jun 10

I didn’t realize popes ever spoke like this, but it’s quite humanizing. Like, emphasizing “I play a character who is different from the real me.”

Manish retweeted

mattparlmer 🪐 🌷

@mattparlmer

Jun 10

Cannot think of a more disastrous set of decisions to make ahead of an IPO, the reaction to data policies alone will show up in their revenue figures, to say nothing of cost control measures

MTS

@MTSlive

Jun 10

SITUATION DETECTED: Microsoft is limiting internal employee use of Fable 5 over Anthropic's new data retention requirements, per The Verge. Fable 5 requires data retention to operate its safety classifiers, unlike other Claude models which run under Zero Data Retention rules.

Manish retweeted

Arthur Tellis

@arthurctellis

Jun 10

Seeing a lot of Fable safeguards hate on the timeline, but "what did y'all think [AI safety] meant? vibes? papers? essays?"
The reality is that there are real tradeoffs in AI safety. Anthropic deserves credit for aggressive resolution of these tradeoffs in favor of safeguards for a model that it believes is (and that in fact is) a step-change in vulnerability research capability.
It's kind of difficult to justify coercive proactive harm mitigation, especially in a libertarian-ish society, but we clearly see the value in mandatory vaccination programs or beatcop policing or surveillance cameras. We should applaud Anthropic for being one of the few institutions in American public life that actually follows through on its convictions, including in implementing really aggressive monitoring, squelching of AI development work (already accounted for in its ToS -- I think the clandestinity is cool too), and exclusionary limits on use for information security-related queries.
The whole point here is that we do not have herd immunity here: our network edge devices, authentication apps/services, and productivity software are extremely vulnerable, not sandboxed, and lack introspection capabilities. We need programs like Glasswing, better cross-company threat detection, and a more effective APT exploitation strategy before we democratize such a robust vuln research capability.
The counterfactual here is that MSS contractors use VPS to access Fable, find jailbreaks for weaker safeguards, and use the system to build an active directory exploit that enables remote access to every O365 app. Not so bueno, huh?
This is incredibly hard; Anthropic may not have calibrated every safeguard correctly this time, but there'll be learning. Model release cycles are getting more concise: they will adapt as they better understand and mitigate risks and competitive pressures manifest.
Histrionic claims of anti-competitive behavior and safetyist hysteria are victim to precisely the error that is being alleged.

elie

@eliebakouch

Jun 9

mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community. also the fact that this is on purpose not visible to the user is crazy

Manish @ManishEarth

Jun 10

i'm really not getting the Fable hate
they released something new and shiny and fancy but it's artificially restricted for various reasons? seems ... fine? they could have also not released it or even built it in the first place.
why do people feel entitled to Better LLMs?

Manish @ManishEarth

Jun 10

(i think there are some legit reasons to complain about this too, but so much of it seems to just be entitlement)

Manish @ManishEarth

Jun 10

related take: x.com/sergeantsup/status/206…

kip

@sergeantsup

Jun 10

I find this reaction a bit off-putting and entitled. it reminds me of a broader pattern that bothers me, but I'm not sure exactly how to put my finger on it
like... anthropic just increased the value of their product to their customers by releasing a powerful new model. they have increased consumer surplus. no one's subscription was contingent on this happening, especially not at a particular time. the release yesterday was a surprise
but for some customers, the surplus was increased less. because, worst case, they have to use the new model in incognito
but it was still increased, so... why start with a loud complaint instead of appreciation? why be negative instead of positive? doesn't feel right to me
I feel like I see this all over the place. like I think it's good when stores are wheelchair accessible, but if a new one pops up and they don't have a ramp, I am more appreciative of its existence than I am disappointed in its subpar accessibility
glosso has weird features and I'm not on board with all of them. but the majority of it is very good, so I'm focusing on the good, and I'm super appreciative that Aella's making it happen
I've seen this with events too I think, people get so critical all the time, people are putting in hard work to give other people things for free or for cheap. people are creating massive consumer surplus all the time. yet there is still a tendency for others to come at things from such a critical angle
I think it's mostly good when people do things and make things. and disproportionate criticism disincentivizes people to do things and make things