Alek Dimitriev retweeted

shb

@himbodhisattva

Jun 13

opus 4.8 with the fable context is some real flowers for algernon shit

137

3,349

257,495

Alek Dimitriev retweeted

Arsh Shah Dilbagi

@arshdilbagi

Jun 13

Introducing Adaline 2.0 - The Agent Self-Improvement Layer. Adaline turns Traces into Behaviors, Behaviors surface Issues, Issues become auto-generated Evals Data, Adaline then generates new agent candidates and tests them. You review the winners and ship!

1:34

115

3,194

753

848,762

Alek Dimitriev retweeted

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

The US government has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States.

anthropic.com

12,522

25,755

87,892

89,556,797

Alek Dimitriev

@tensor_rotator

Jun 9

Nvidia is nothing without its people

Kalshi

@Kalshi

Jun 9

JUST IN: Nvidia is now worth more than India

1,482

Alek Dimitriev

@tensor_rotator

Jun 9

yann lecun in shambles

matt

@MattVMacfarlane

Jun 9

Was using Fable 5 to write my world model training code. Anthropic flagged it as frontier AI research. The steering vector kicked in and it started implementing JEPA 🤨

8,098

Alek Dimitriev

@tensor_rotator

Jun 9

I asked Fable for an original joke and AFAICT this is genuinely novel and not bad! The shul roof springs a leak. The rabbi stays up all night preparing his case to bring before the Almighty — citations from the prophets, the accumulated merits of the congregation's grandparents, and for the closing argument, a pointed reminder of what He once did with forty days of rain. At dawn, before the rabbi can deliver a single word of it, there's a knock at the door. A stranger, moved by a dream, hands over the full cost of a new roof. The congregation celebrates. The rabbi sulks for a week. Finally his wife demands to know what's wrong. "He settled out of court."

708

Alek Dimitriev

@tensor_rotator

Jun 9

Karpathy joins Anthropic. Anthropic releases Fable. Coincidence? I think not!

Andrej Karpathy

@karpathy

Jun 9

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

6,168

Alek Dimitriev

@tensor_rotator

Jun 9

A flat performance on a benchmark with increasing test time compute sometimes means that the models are not good enough right now, but they will be soon enough. Mythos shatters the SOTA with a clean trend-line.

Cognition

@cognition

Jun 8

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40 hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?

1,194

Alek Dimitriev

@tensor_rotator

Jun 9

We have scores normalized by test time compute in our Mythos launch for many benchmarks!

Noam Brown

@polynoamial

Jun 9

We've known about LLM test-time compute scaling since @OpenAI o1. Yet 2 years later labs still report scalar evals for models; safety orgs are still surprised when a scaffold does better via 100x inference; and RSPs still ignore inference budget when deciding critical thresholds.

387

Alek Dimitriev

@tensor_rotator

Jun 9

Fable 5 has entered the chat. It’s the same underlying model as Mythos 5, but with extra safeguards. It was a lot of effort to figure out how to generally release it, but now that we’ve developed robust safeguards around it, we can’t wait to get it into everyone’s hands.

Claude

@claudeai

Jun 9

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

0:20

559

Alek Dimitriev

@tensor_rotator

Jun 9

We recently published our two stage probe classifier setup at ICLR if you want to learn more: openreview.net/forum?id=eNvs…

Constitutional Classifiers++: Efficient Production-Grade Defenses...

We introduce enhanced Constitutional Classifiers that deliver production-grade jailbreak robustness with dramatically reduced computational costs and refusal rates compared to previous-generation...

openreview.net

113

Alek Dimitriev

@tensor_rotator

Jun 4

In case you're wondering, yes we're feeling the AGI.

Anthropic

@AnthropicAI

Jun 4

Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…

1,410

217,941

Alek Dimitriev retweeted

Maarten Boudry

@mboudry

May 30

Amazing graph. One of the best visualizations of human progress.

142

1,202

98,666

Alek Dimitriev

@tensor_rotator

May 28

New Opus today and a new reduced price for fast mode on Opus 4.8! Fast mode was six times more expensive, but is now only 2x the price for 2.5x the speed, try it out!

Claude

@claudeai

May 28

Replying to @claudeai

Fast mode is available for Opus 4.8. It's the same model at roughly 2.5x the speed, and we've made it three times cheaper than before. Turn it on with /fast in Claude Code. On the API, contact your account manager to request access or join the waitlist: claude.com/fast-mode

1,243

Alek Dimitriev retweeted

levent

@__alpoge__

May 26

over the weekend i checked the obvious thing, which is whether mythos is able to solve the erdos unit distance problem, aka erdos problem #90. the answer is: yea

143

2,004

624,698

Alek Dimitriev

@tensor_rotator

May 22

He cut the best part, Mythos's reaction!

Elon Musk

@elonmusk

May 22

Humans using Mythos as seen by Mythos

1,420

Alek Dimitriev

@tensor_rotator

May 21

Please be careful though!

Alek Dimitriev

@tensor_rotator

May 21

Just like SF is the AI epicenter, it is also the weight loss peptide mecca. If you live here and aren’t using peptides, you at least know many people who do. The cutting edge is currently GLP-3 retatrutide, and Eli Lilly’s phase 3 trial results are out: "participants on 12 mg lost an average of 70.3 lbs (28.3%) over 80 weeks." Incredible results!

2,388

Alek Dimitriev

@tensor_rotator

May 21

57,983