AI Explained

Tweets

AI Explained

@AIExplainedYT

Apr 7

Anthropic: "We do not plan to make Claude Mythos Preview generally available" A big line, buried quite deep. Possible reasons? So many, inc:
1) The model is expensive (25/125), not far off GPT 4.5, which became commercially unviable. Less likely, given the claims about Mythos.
2) They genuinely are worried about unleashing cybersecurity chaos on the world.
3) They don't have capacity to serve it at scale yet.
4) They will quickly distil the early access outputs of Mythos into a lighter model, so no need to release the bigger model when a more cost-efficient one is coming imminently.
5) Other.
Not read the 250 page report yet, but will do.

815

94,954

AI Explained

@AIExplainedYT

Jan 29

The Adolescence of Technology is a well-written new 20,000-word essay on what you should expect from the near future of AI. I read it in full, every footnote and link, and have these 10 questions (of a type not asked at Davos) for @DarioAmodei, the essay's author and CEO of Anthropic, makers of Claude. 1/12

Dario Amodei

@DarioAmodei

Jan 26

The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies and democracy—and how we can defend against them: darioamodei.com/essay/the-ad…

10,089

AI Explained

@AIExplainedYT

Jan 29

8. Can you describe the tipping point when you decided to switch from training Claude to ‘avoid implying it had a personal identity’ in ‘23-24 to ‘encourag[ing] Claude to think of itself as a particular type of person’ in ‘25-’26? 9/12

4,561

AI Explained

@AIExplainedYT

Jan 29

9. In 2023, you predicted that ‘AI systems may facilitate extraordinary insights in broad swaths of many science and engineering disciplines’ by ‘24-25, but did you mean purely LLMs (in which case, are you disappointed?), or, if you meant systems like Google’s WeatherNext or AlphaEvolve, why have Anthropic never publicly posted about/worked on neuro-symbolic or non-LLM systems? 10/12

10. Do you acknowledge the conflict of interest you could be perceived to have, in that you are calling to stop China getting Nvidia chips while at the same time it is those open-weight Chinese models, and scaffolds like Kimi Code, that could most threaten Anthropic’s revenue? 11/12

3,987

AI Explained

@AIExplainedYT

20 Nov 2025

Nano Banana Pro drew an admirably edgy Rake's Progress, 2025-edition.

7,260

AI Explained

@AIExplainedYT

13 Aug 2025

If you use GPT-5 Pro for coding, you will swiftly realize that it will never agree to anything, even its own suggestions, without adding 'two quick tweaks'. It's a pathological perfectionist.
*Still very useful, just strange in this particular way.
**Gave Pro this tweet and it suggested this new version, with 'two tweaks': "If you use GPT‑5 Pro for coding, you’ll quickly realize it never accepts anything—even its own suggestions—without adding “two quick tweaks.” It’s a pathological perfectionist. Still useful, just strange in this particular way."
***Gave Pro that tweet, and it had 'two tiny notes': "When posting, drop the outermost quotation marks (they’re just framing here). Check you’re within the 280‑character limit (~229 chars, including line breaks)."

375

51,209

AI Explained

@AIExplainedYT

28 May 2025

2 quick updates, and look-ahead, exactly a year on from first testing models on Simple-Bench:
1) Claude 4 busted our rate limits, and my entreaties to @AnthropicAI (to allow us to spend more money!) have yet to bear fruit. A shame, as am fairly confident Opus 4 would be SOTA.
2) Gemini 2.5 Pro 05-06 and Flash 05-20 (the latest versions) are actually a slight downgrade in both performance and instruction-following and the one full run we got out of 2.5 Pro got 46% (below the previous version's 51%). We would prefer to get an AVG@5, for fairness, before posting on the leaderboard.
Thoughts: RL becoming 20% of the compute spend for frontier models may have more strange side effects than labs were anticipating. 'Over-eagerness' over simply following commands seems barely under control. On Simple, I had been fairly confident it would be saturated (>80-85%) by the end of the year. Now I think it is more like 50-50, and progress could instead slow for a while, as models become relentlessly optimised for dollar-maximising tasks, like software engineering, over general nous. Spatial intelligence, like spotting that the glove would fall onto the road, in the question pasted at the bottom of this tweet, is simply not yet as lucrative.
As ever, grateful to @weights_biases and @Ag_Mlynarczyk in particular for keeping the show on the road.
Q. A luxury sports-car is traveling north at 30km/h over a roadbridge, 250m long, which runs over a river that is flowing at 5km/h eastward. The wind is blowing at 1km/h westward, slow enough not to bother the pedestrians snapping photos of the car from both sides of the roadbridge as the car passes. A glove was stored in the trunk of the car, but slips out of a hole and drops out when the car is half-way over the bridge. Assume the car continues in the same direction at the same speed, and the wind and river continue to move as stated. 1 hour later, the water-proof glove is (relative to the center of the bridge) approximately?
Models (super-trained on HS Math): 4km East
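The gap between the two readings of the question above can be put into numbers. This is only an illustrative sketch: the function names are hypothetical, and the second function assumes (as the tweet suggests) that the glove lands on the road and stays on the bridge deck, so its distance from the center is bounded by half the bridge length.

```python
def naive_offset_km(river_east_kmh: float, wind_west_kmh: float, hours: float) -> float:
    """'HS Math' reading: treat the glove as drifting with river and wind."""
    # Eastward river speed minus westward wind speed, times elapsed time.
    return (river_east_kmh - wind_west_kmh) * hours


def max_offset_on_bridge_km(bridge_length_m: float) -> float:
    """Spatial reading (assumption): the glove never leaves the bridge deck,
    so it can be at most half a bridge length from the center."""
    return (bridge_length_m / 2) / 1000


if __name__ == "__main__":
    print(naive_offset_km(5, 1, 1))       # drift if the glove were in the water
    print(max_offset_on_bridge_km(250))   # upper bound if it stays on the road
```

Under these assumptions the naive reading gives 4 km east, while a glove lying on the road is no more than 0.125 km from the center of the bridge.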

308

36,541

AI Explained retweeted

Nathan Labenz

@labenz

30 Jan 2025

The shift from a cautious, collaborative attitude wrt China not long ago to a zero-sum competitive outlook today - by both Dario and Sam - has been very disappointing - and they’ve offered no explanation for the change!

ControlAI

@ControlAI

30 Jan 2025

Dario Amodei (2017) warning of the dangers of US-China AI racing: "that can create the perfect storm for safety catastrophes to happen"

0:45

496

56,931

AI Explained retweeted

Weights & Biases

@wandb

13 Jan 2025

🪄 Think you’re an AI wizard? Prove it. We’ve partnered w/ @AIExplainedYT to launch the Simple Bench Evals Competition—a challenge so tough, he said: “If anyone gets 20/20 with a general-purpose prompt, I would be truly shocked.” 😳 Details below 👇

12,359

AI Explained

@AIExplainedYT

18 Dec 2024

A shot of the AI Explained studio in the Shard, London...(courtesy of Veo 2)

0:08

225

19,126

AI Explained

@AIExplainedYT

19 Sep 2024

Are we in a new, 3rd paradigm for AI language models? First, models predicted the most likely next word. Think 2018-2021, for transformer-based language models. Second, they were rewarded for words that were helpful, harmless, and honest. Think RLHF or RLAIF, 2022-2023. Now, with the o1 family, they are being rewarded for being objectively correct. Think 2024-??? Breakdown in video below:

571

57,881

AI Explained

@AIExplainedYT

19 Sep 2024

youtube.com/watch?v=KKF7kL0p…

o1 - What is Going On? Why o1 is a 3rd Paradigm of Model 10 Things...

o1 is different, and even sceptics are calling it a 'large reasonin...

youtube.com

108

17,006