Research @AnthropicAI. Opinions are my own!

Joined April 2021
534 Photos and videos
Alex Albert retweeted
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
12,614
25,789
88,176
90,398,595
Alex Albert retweeted
Claude Fable 5 scores very well on FrontierMath: Tiers 1–4 (v2), reaching 87% on Tiers 1–3 and 88% on Tier 4. This continues a streak of Anthropic models improving rapidly at math.
45
141
1,030
493,208
Fable feels superhuman at working over long agentic conversations, sometimes to the point where I can't keep up with what it's telling me 😅 This prompt snippet has been the best fix I've found for getting it to write clearly and drop any jargon:
76
29
1,028
63,934
Alex Albert retweeted
We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible. Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days). We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right. Making the safeguards visible makes them easier to work around, so keeping them robust to jailbreaks will unfortunately mean more false positives while we improve the classifiers. We're also tuning our bio and cyber classifiers to trigger less often on harmless requests. We know this is frustrating and we’ll do our best to keep this period as short as possible. If you think a request has been mistakenly flagged: run /feedback in Claude Code, click thumbs-down on the fallback in Claude.ai or Cowork, or file the safeguard appeal form for API requests. Your reports help us tune these classifiers and we appreciate your feedback. support.claude.com/en/articl…
669
432
5,084
847,788
Alex Albert retweeted
Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: darioamodei.com/post/policy-…
1,350
2,436
13,574
6,532,445
We've reset usage limits across our products! For those just starting to test Fable, here's four tips for using it more effectively: 1. Give it bigger, more ambitious tasks than what previous models could handle. 2. Use xhigh/high effort as your default for best performance, med for faster interactive sessions. 3. Rework your skills and CLAUDE.mds. Instructions written for prior models anchor Fable to stale patterns, let it use its own judgment first. 4. Move from providing tasks to providing objectives. Describe what done looks like and how to verify it, then let Fable find the path (/loop and /goal are built for this)
To celebrate the Fable 5 launch, we just reset 5-hour and weekly limits for all users across our products! Enjoy 🚀
152
169
2,961
382,762
Alex Albert retweeted
Releasing a model this capable comes with risks. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage. Queries on a narrow range of topics will instead receive a response from our next-most-capable model, Opus 4.8.
167
417
7,676
1,229,198
I've been at Anthropic through every model launch. There's been a few cases I can remember of a launch that stands out and marks a step-change in how we use models: - Claude Opus 3 - Claude Sonnet 3.5 - Claude Opus 4.5 And now Claude Fable 5. With Fable, the model stopped feeling like a tool I direct and started feeling more like something I collaborate with.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
193
129
3,273
606,212
Alex Albert retweeted
We've doubled usage limits in Claude Cowork for the next month. Delegate bigger, more complex tasks to Claude.
803
830
13,380
1,749,933
We just published internal data on how much of Claude's development is already being done by Claude: - Over 80% of all code merged into our codebase is now written by Claude - It's been months since many researchers at Anthropic hand-wrote code - The typical Anthropic engineer ships 8x as much code as they did in 2024 - On the most open-ended engineering tasks, Claude's success rate jumped from ~26% to 76% in 6 months - When research sessions went off-track, Claude proposed a better next step than the human took 64% of the time We're not at recursive self-improvement yet, but it could come sooner than most expect. I highly recommend reading the full blog post.
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
189
216
3,263
420,758
Alex Albert retweeted
We've reset 5-hour and weekly rate limits for all users on Pro and Max plans. We fixed an issue that caused some Claude Code sessions to spawn excessive parallel subagents, burning through usage faster than expected.
1,110
1,043
20,494
2,551,002
We put a lot of work into calibrating thinking effort for Opus 4.8. As you're trying out the model, if you do run into any examples of it still over/under thinking, please flag it to us!
hello beloved tasteful users, do you like how much claude thinks on your tasks? would love examples of it thinking too much or too little
51
10
433
39,283
Fast mode for Opus 4.8 is much more affordable now. Try it out in Claude Code, I've found it changes how I use Claude. Fast mode for interactive work where I want rapid responses, normal mode for longer async tasks where I don't need results right away.
May 28
Replying to @claudeai
Fast mode is available for Opus 4.8. It's the same model at roughly 2.5x the speed, and we've made it three times cheaper than before. Turn it on with /fast in Claude Code. On the API, contact your account manager to request access or join the waitlist: claude.com/fast-mode
46
11
458
40,797
Alex Albert retweeted
BREAKING: Anthropic just dropped Opus 4.8—and it is a MONSTER We've been testing for about a week @every and our verdict is they could've just called it Opus 5, it's that good. Here's our vibe check: - Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63—a hair higher than GPT-5.5's score of 62, and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results. - Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark—measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had higher incidence of AI-isms—we found best results with high. - Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark. - Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic. THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude. Anthropic is back baby! Read the rest on @every: every.to/vibe-check/opus-4-8…
141
145
1,748
350,658
Excited to release Opus 4.8 today! We heard your feedback on 4.7 and have made many fixes for 4.8. 4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.
May 28
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.
171
92
2,091
189,602
Alex Albert retweeted
over the weekend i checked the obvious thing, which is whether mythos is able to solve the erdos unit distance problem, aka erdos problem #90. the answer is: yea
54
143
2,004
624,914
Alex Albert retweeted
Here's my new episode with @alexalbert__, who shared an inside look at how Anthropic is building the next Claude. We talked about how the research team: → Plans for the model and harness together → Uses Claude to turn user feedback into evals → Trains Claude's character & personality Some quotes from Alex: "We use Claude to cluster user feedback, find top themes, and create synthetic versions of user problems that we then turn into evals." "We need to think about how the model is exposed through all our surfaces, whether it's API or Claude Code or Cowork. The product has a blend with the model and that affects your end user's experience." ""As these things become agents running tasks for a long time and making judgment decisions, what its character is and what it cares about are very important." 📌 Watch now: youtu.be/T4ieZPIEmd8 Thanks to our sponsors: @WisprFlow: Don't type, just speak ref.wisprflow.ai/peteryang @oceanstalent: Hire AI-native executive assistants oceanstalent.com/peter
25
17
172
95,178
Alex Albert retweeted
Happy Friday! We've reset everyone's 5-hour and weekly rate limits.
1,603
1,336
31,094
2,076,036
Alex Albert retweeted
Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.
1,366
2,048
22,463
2,781,275