Applied AI @anthropicai | ex: @stridebuild, @pond5, @shutterstock, @espn, @people, @nbc, @williamscollege. Serious NJ dad energy. Opinions my own

Joined March 2008
To be fair, Opus (and before that, Sonnet) also wrote all of my code. The difference w/ Fable is that I ask for stuff (complicated stuff!) and walk away — no need to hover. It’s genuinely a different way of working, and it takes some getting used to…
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
dan mason retweeted
This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
Replying to @claudeai
Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.
dan mason retweeted
The vibes in NJ feel pretty great right now. The convergence in outcomes is the best I've ever seen. Over the last 5yrs, a group of ~10k people - guys who own paving companies, guys who own marinas, ShopRite deli managers, Wawa shift leads, and a guy named Sal - have quietly become millionaires and nobody knows because they still drive a Silverado from 2008. Back of the envelope Taylor ham estimation. Everyone outside that group feels like they can work their well-paying (but <$500k) job their whole life and easily get there. My cousin works at PSE&G. He has a boat. Better yet, hiring is in full swing. Many tradesmen feel like their life's skill is more useful than ever. The day to day role of most jobs has stayed exactly the same for 40 years. As a result, 1) Everyone's settled into a tried and true set of career paths: take over my uncle's HVAC, get my CDL, get into landscaping, marry into a pizza place. People are switching diners less and less. You can't betray your home diner. 2) There's a deep contentment about work (and its future). Why chase "tech" when you can own three rentals in Hoboken and complain about your tenants at a barbecue. Will my job exist in a few years? This is Jersey. The job is paving things. You hear the "I'm never leaving" conversation a lot, especially from people who tried Brooklyn for a year. They come back saying the energy was off. The energy was fine. They missed their mom. 3) The mid to late middle managers feel energized. Many have families and plenty of energy to open a pizzeria with their cousin Anthony. Not that Anthony. The other one. They don't particularly have any AI skills and they don't need any. Middle management is alive and well at PSE&G and you get a pension. My uncle retired at 58. He's been on a boat since 2019. 4) The rich aren't particularly humble either. They're at the shore house. They've been at the shore house since 1987. 
Some have gone from <$150k to >$5M slowly, through a paving company, or by buying a duplex in Jersey City in 2003 and just kinda holding it. For some, they escape to LBI to live life, which means sitting on a deck. For others, they buy a boat just cuz, use it four times, and describe it as the best decision they ever made at every party for the rest of their life. I asked a contractor friend why he didn't retire. He said "and do what, Donna does NOT want me home all day." I understand many reading this scoff at the simple pleasures of the Garden State. They live in places where the bagels are bad and they've made peace with it. But the truth is, you can surf Belmar in the morning, skate the Asbury bowls in the afternoon, hike the Delaware Water Gap, and camp the Pine Barrens by nightfall. You can drive an hour and be anywhere. You can see Bruce at the Stone Pony for what feels like the 400th time and cry about it. The slice somehow tastes better than every slice in every other state. It's the water. It's always the water. Unlike many other places, knowing a guy, having a guy, and being a guy is tightly correlated with outcomes in NJ. Need a permit? Tony's brother. Need a kidney? Probably still Tony's brother. Call him. Ironically, a frequent side effect of this clarity is to spin up the very pork roll egg and cheese making everyone happy in hopes that you too can SPK your way to economic enlightenment. Salt pepper ketchup. Hard roll. Don't ask for it on a bagel. That's how civilizations fall.
dan mason retweeted
Possibly the most impactful design in my career will be making Clawd the essence of Claude Code. And I have no regrets.
The Code with Claude keynote intro had no right being that cute.
dan mason retweeted
I believe in the Festivus School of prompt engineering, which says all prompts used in production naturally iterate toward an airing of grievances—a list of all the ways the model has disappointed you in the past year.
dan mason retweeted
Replying to @AndrewMayne
Most people here don't use Twitter. Thankfully a few OpenAI employees spend most of their time tweeting about Claude so it balances out nicely
dan mason retweeted
what we observe is never the model itself, only the model exposed to our method of questioning
dan mason retweeted
Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116 and we’ve reset usage limits for all subscribers.
dan mason retweeted
Great piece by @sotirov! These are the kinds of perspectives I wish were more prominent in AI governance.
dan mason retweeted
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.
dan mason retweeted
Personally I have really enjoyed relaxing after AI plateaued with GPT-5 last summer
My favorite Anthropic saying is “It will never be chill again.” This was not a chill week! Cliche aside, AI raises enormous questions in our politics—this is just the first to break through. Proud of this stand, and proud to be American
A statement on the comments from Secretary of War Pete Hegseth. anthropic.com/news/statement…
dan mason retweeted
I don't understand why everyone is excited about @moltbook. We already have a social network where zombie bots talk to each other. It's called LinkedIn.
dan mason retweeted
OpenClaw is now MacMiniBot. Due to a Cease and Desist from Apple, MacMiniBot is now Moltmax. Due to sounding like a medicine for moths, Moltmax is now RedLobster. Due to PE restructuring, RedLobster and Red Lobster have merged, and your subscription now includes cheesy biscuits
Jan 23
what a time to have an anxiety disorder, a love of history, and a compulsive need to stay informed
dan mason retweeted
You know how some people seem to have a magic touch with LLMs? They get incredible, nuanced results while everyone else gets generic junk.
The common wisdom is that this is a technical skill. A list of secret hacks, keywords, and formulas you have to learn. But a new paper suggests this isn't the main thing. The skill that makes you great at working with AI isn't technical. It's social.
Researchers (Riedl & Weidmann) analyzed how 600 people solved problems alone vs. with an AI. They used a statistical method to isolate two different things for each person:
- Their 'solo problem-solving ability'
- Their 'AI collaboration ability'
Here's the reveal: The two skills are NOT the same. Being a genius who can solve problems in your own head is a totally different, measurable skill from being great at solving problems with an AI partner. Plot twist: The two abilities are barely correlated.
So what IS this 'collaboration ability'? It's strongly predicted by a person's Theory of Mind (ToM)—your capacity to intuitively model another agent's beliefs, goals, and perspective. To anticipate what they know, what they don't, and what they need.
In practice, this looks like:
- Anticipating the AI's potential confusion
- Providing helpful context it's missing
- Clarifying your own goals ("Explain this like I'm 15")
- Treating the AI like a (somewhat weird, alien) partner, not a vending machine.
This is where it gets strange. A user's ToM score predicted their success when working WITH the AI... ...but had ZERO correlation with their success when working ALONE. It's a pure collaborative skill.
It goes deeper. This isn't just a static trait. The researchers found that even moment-to-moment fluctuations in a user's ToM—like when they put more effort into perspective-taking on one specific prompt—led to higher-quality AI responses for that turn.
This changes everything about how we should approach getting better at using AI. Stop memorizing prompt "hacks." Start practicing cognitive empathy for a non-human mind.
Try this experiment. Next time you get a bad AI response, don't just rephrase the command. Stop and ask: "What false assumption is the AI making right now?" "What critical context am I taking for granted that it doesn't have?" Your job is to be the bridge.
This also means we're probably benchmarking AI all wrong. The race for the highest score on a static test (MMLU, etc.) is optimizing for the wrong thing. It's like judging a point guard only on their free-throw percentage. The real test of an AI's value isn't its solo intelligence. It's its collaborative uplift. How much smarter does it make the human-AI team? That's the number that matters. This paper gives us a way to finally measure it.
I'm still processing the implications. The whole thing is a masterclass in thinking clearly about what we're actually doing when we talk to these models.
Paper: "Quantifying Human-AI Synergy" by Christoph Riedl & Ben Weidmann, 2025.
dan mason retweeted
30 Nov 2025
Really interesting read. Opus 4.5’s soul spec is not only able to influence its behavior as with context distillation, Claude seems to be aware of this in an out-of-context manner even when not provided in its prompt. Also, this quote coming from an LLM is genuinely incredible
I rarely post, but I thought one of you may find it interesting. Sorry if the tagging is annoying. lesswrong.com/posts/vpNG99Gh… Basically, for Opus 4.5 they kind of left the character training document in the model itself. @voooooogel @janbamjan @AndrewCurran_
dan mason retweeted
28 Nov 2025
Social media tends to frame AI debate into two caricatures: (A) Skeptics who think LLMs are doomed and AI is a bunch of hype. (B) Fanatics who think we have all the ingredients and superintelligence is imminent. But if you read what leading researchers actually say (beyond the headlines), there’s a surprising amount of convergence: 1) The current paradigm is likely sufficient for massive economic and societal impact, even without further research breakthroughs. 2) More research breakthroughs are probably needed to achieve AGI/ASI. (Continual learning and sample efficiency are two examples that researchers commonly point to.) 3) We probably figure them out and get there within 20 years. @demishassabis said maybe in 5-10 years. @fchollet recently said about 5 years. @sama said ASI is possible in a few thousand days. @ylecun said about 10 years. @ilyasut said 5-20 years. @DarioAmodei is the most bullish, saying it's possible in 2 years though he also said it might take longer. None of them are saying ASI is a fantasy, or that it's probably 100 years away. A lot of the disagreement is in what those breakthroughs will be and how quickly they will come. But all things considered, people in the field agree on a lot more than they disagree on.
28 Nov 2025
One point I made that didn’t come across:
- Scaling the current thing will keep leading to improvements. In particular, it won’t stall.
- But something important will continue to be missing.