dan mason

Tweets

dan mason @danmason

Jun 10

To be fair, Opus (and before that, Sonnet) also wrote all of my code. The difference w/ Fable is that I ask for stuff (complicated stuff!) and walk away — no need to hover. It’s genuinely a different way of working, and it takes some getting used to…

Claude

@claudeai

Jun 9

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

Andrej Karpathy

dan mason retweeted

Andrej Karpathy

@karpathy

Jun 9

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger-happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

Claude

@claudeai

Jun 9

Replying to @claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.

Benchmark table titled Mythos 5 & Fable 5, comparing Claude Mythos 5 and Fable 5 against Claude Mythos Preview, Claude Opus 4.8, GPT 5.5, and Gemini 3.1 Pro.

staysaasy

dan mason retweeted

staysaasy

@staysaasy

May 17

The vibes in NJ feel pretty great right now. The convergence in outcomes is the best I've ever seen.

Over the last 5yrs, a group of ~10k people - guys who own paving companies, guys who own marinas, ShopRite deli managers, Wawa shift leads, and a guy named Sal - have quietly become millionaires and nobody knows because they still drive a Silverado from 2008. Back of the envelope Taylor ham estimation. Everyone outside that group feels like they can work their well-paying (but <$500k) job their whole life and easily get there. My cousin works at PSE&G. He has a boat.

Better yet, hiring is in full swing. Many tradesmen feel like their life's skill is more useful than ever. The day to day role of most jobs has stayed exactly the same for 40 years.

As a result,

1) Everyone's settled into a tried and true set of career paths: take over my uncle's HVAC, get my CDL, get into landscaping, marry into a pizza place. People are switching diners less and less. You can't betray your home diner.

2) There's a deep contentment about work (and its future). Why chase "tech" when you can own three rentals in Hoboken and complain about your tenants at a barbecue. Will my job exist in a few years? This is Jersey. The job is paving things. You hear the "I'm never leaving" conversation a lot, especially from people who tried Brooklyn for a year. They come back saying the energy was off. The energy was fine. They missed their mom.

3) The mid to late middle managers feel energized. Many have families and plenty of energy to open a pizzeria with their cousin Anthony. Not that Anthony. The other one. They don't particularly have any AI skills and they don't need any. Middle management is alive and well at PSE&G and you get a pension. My uncle retired at 58. He's been on a boat since 2019.

4) The rich aren't particularly humble either. They're at the shore house. They've been at the shore house since 1987. Some have gone from <$150k to >$5M slowly, through a paving company, or by buying a duplex in Jersey City in 2003 and just kinda holding it. For some, they escape to LBI to live life, which means sitting on a deck. For others, they buy a boat just cuz, use it four times, and describe it as the best decision they ever made at every party for the rest of their life. I asked a contractor friend why he didn't retire. He said "and do what, Donna does NOT want me home all day."

I understand many reading this scoff at the simple pleasures of the Garden State. They live in places where the bagels are bad and they've made peace with it. But the truth is, you can surf Belmar in the morning, skate the Asbury bowls in the afternoon, hike the Delaware Water Gap, and camp the Pine Barrens by nightfall. You can drive an hour and be anywhere. You can see Bruce at the Stone Pony for what feels like the 400th time and cry about it. The slice somehow tastes better than every slice in every other state. It's the water. It's always the water.

Unlike many other places, knowing a guy, having a guy, and being a guy is tightly correlated with outcomes in NJ. Need a permit? Tony's brother. Need a kidney? Probably still Tony's brother. Call him.

Ironically, a frequent side effect of this clarity is to spin up the very pork roll egg and cheese making everyone happy in hopes that you too can SPK your way to economic enlightenment. Salt pepper ketchup. Hard roll. Don't ask for it on a bagel. That's how civilizations fall.

Meaghan

dan mason retweeted

Meaghan

@meaghaneschoi

May 6

Possibly the most impactful design in my career will be making Clawd the essence of Claude Code. And I have no regrets.

Andreas Storm

@avstorm

May 6

The Code with Claude keynote intro had no right being that cute.

dan mason

dan mason @danmason

May 6

:success-kid: anthropic.com/news/higher-li…

Higher usage limits for Claude and a compute deal with SpaceX

We’ve raised Claude's usage limits and agreed a new compute partnership with SpaceX that will substantially increase our capacity in the near term.

anthropic.com

Riley Goodside

dan mason retweeted

Riley Goodside

@goodside

May 5

I believe in the Festivus School of prompt engineering, which says all prompts used in production naturally iterate toward an airing of grievances—a list of all the ways the model has disappointed you in the past year.

sam mcallister

dan mason retweeted

sam mcallister

@sammcallister

May 2

Replying to @AndrewMayne

Most people here don't use Twitter. Thankfully a few OpenAI employees spend most of their time tweeting about Claude so it balances out nicely

Dan Shipper 📧

dan mason retweeted

Dan Shipper 📧

@danshipper

Apr 25

what we observe is never the model itself, only the model exposed to our method of questioning

ClaudeDevs

dan mason retweeted

ClaudeDevs

@ClaudeDevs

Apr 23

Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116 and we’ve reset usage limits for all subscribers.

Séb Krier

dan mason retweeted

Séb Krier

@sebkrier

Apr 19

Great piece by @sotirov! These are the kinds of perspectives I wish were more prominent in AI governance.

Emil Sotirov

@sotirov

Apr 18

x.com/i/article/204556194898…

Claude

dan mason retweeted

Claude

@claudeai

Apr 8

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

Dean W. Ball

dan mason retweeted

Dean W. Ball

@deanwball

Apr 7

Personally I have really enjoyed relaxing after AI plateaued with GPT-5 last summer

dan mason

dan mason @danmason

Feb 28

My favorite Anthropic saying is “It will never be chill again.” This was not a chill week! Cliche aside, AI raises enormous questions in our politics—this is just the first to break through. Proud of this stand, and proud to be American

Anthropic

@AnthropicAI

Feb 28

A statement on the comments from Secretary of War Pete Hegseth. anthropic.com/news/statement…

Dr. Eli David

dan mason retweeted

Dr. Eli David

@DrEliDavid

Jan 31

I don't understand why everyone is excited about @moltbook. We already have a social network where zombie bots talk to each other. It's called LinkedIn.

will brown

dan mason retweeted

will brown

@willccbb

Jan 30

OpenClaw is now MacMiniBot. Due to a Cease and Desist from Apple, MacMiniBot is now Moltmax. Due to sounding like a medicine for moths, Moltmax is now RedLobster. Due to PE restructuring, RedLobster and Red Lobster have merged, and your subscription now includes cheesy biscuits

dan mason

dan mason @danmason

Jan 24

…

mitra

@mitrajoy_

Jan 23

what a time to have an anxiety disorder, a love of history, and a compulsive need to stay informed

Aaron Levie

dan mason retweeted

Aaron Levie

@levie

Jan 18

x.com/i/article/201301444570…

Carlos E. Perez

dan mason retweeted

Carlos E. Perez

@IntuitMachine

6 Dec 2025

You know how some people seem to have a magic touch with LLMs? They get incredible, nuanced results while everyone else gets generic junk.

The common wisdom is that this is a technical skill. A list of secret hacks, keywords, and formulas you have to learn. But a new paper suggests this isn't the main thing. The skill that makes you great at working with AI isn't technical. It's social.

Researchers (Riedl & Weidmann) analyzed how 600 people solved problems alone vs. with an AI. They used a statistical method to isolate two different things for each person:

- Their 'solo problem-solving ability'
- Their 'AI collaboration ability'

Here's the reveal: The two skills are NOT the same. Being a genius who can solve problems in your own head is a totally different, measurable skill from being great at solving problems with an AI partner. Plot twist: The two abilities are barely correlated.

So what IS this 'collaboration ability'? It's strongly predicted by a person's Theory of Mind (ToM): your capacity to intuitively model another agent's beliefs, goals, and perspective. To anticipate what they know, what they don't, and what they need.

In practice, this looks like:

- Anticipating the AI's potential confusion
- Providing helpful context it's missing
- Clarifying your own goals ("Explain this like I'm 15")
- Treating the AI like a (somewhat weird, alien) partner, not a vending machine.

This is where it gets strange. A user's ToM score predicted their success when working WITH the AI... but had ZERO correlation with their success when working ALONE. It's a pure collaborative skill.

It goes deeper. This isn't just a static trait. The researchers found that even moment-to-moment fluctuations in a user's ToM, like when they put more effort into perspective-taking on one specific prompt, led to higher-quality AI responses for that turn.

This changes everything about how we should approach getting better at using AI. Stop memorizing prompt "hacks." Start practicing cognitive empathy for a non-human mind.

Try this experiment. Next time you get a bad AI response, don't just rephrase the command. Stop and ask:

- "What false assumption is the AI making right now?"
- "What critical context am I taking for granted that it doesn't have?"

Your job is to be the bridge.

This also means we're probably benchmarking AI all wrong. The race for the highest score on a static test (MMLU, etc.) is optimizing for the wrong thing. It's like judging a point guard only on their free-throw percentage. The real test of an AI's value isn't its solo intelligence. It's its collaborative uplift. How much smarter does it make the human-AI team? That's the number that matters. This paper gives us a way to finally measure it.

I'm still processing the implications. The whole thing is a masterclass in thinking clearly about what we're actually doing when we talk to these models.

Paper: "Quantifying Human-AI Synergy" by Christoph Riedl & Ben Weidmann, 2025.

wh

dan mason retweeted

@nrehiew_

30 Nov 2025

Really interesting read. Opus 4.5’s soul spec is not only able to influence its behavior as with context distillation, Claude seems to be aware of this in an out-of-context manner even when not provided in its prompt. Also, this quote coming from an LLM is genuinely incredible

Richard Weiss @RichardWeiss00

29 Nov 2025

I rarely post, but I thought one of you may find it interesting. Sorry if the tagging is annoying. lesswrong.com/posts/vpNG99Gh… Basically, for Opus 4.5 they kind of left the character training document in the model itself. @voooooogel @janbamjan @AndrewCurran_

Noam Brown

dan mason retweeted

Noam Brown

@polynoamial

28 Nov 2025

Social media tends to frame AI debate into two caricatures:

(A) Skeptics who think LLMs are doomed and AI is a bunch of hype.
(B) Fanatics who think we have all the ingredients and superintelligence is imminent.

But if you read what leading researchers actually say (beyond the headlines), there’s a surprising amount of convergence:

1) The current paradigm is likely sufficient for massive economic and societal impact, even without further research breakthroughs.
2) More research breakthroughs are probably needed to achieve AGI/ASI. (Continual learning and sample efficiency are two examples that researchers commonly point to.)
3) We probably figure them out and get there within 20 years.

@demishassabis said maybe in 5-10 years. @fchollet recently said about 5 years. @sama said ASI is possible in a few thousand days. @ylecun said about 10 years. @ilyasut said 5-20 years. @DarioAmodei is the most bullish, saying it's possible in 2 years though he also said it might take longer.

None of them are saying ASI is a fantasy, or that it's probably 100 years away. A lot of the disagreement is in what those breakthroughs will be and how quickly they will come. But all things considered, people in the field agree on a lot more than they disagree on.

Ilya Sutskever

@ilyasut

28 Nov 2025

One point I made that didn’t come across:
- Scaling the current thing will keep leading to improvements. In particular, it won’t stall.
- But something important will continue to be missing.
