Joined July 2024
"I feel confident that even [Anthropic] will do the right thing"
Sam Altman on Anthropic: "They've built a company on hating us or something like that. I think we all care about not destroying the world with AI." — #sama
Bayesian @LessOnline&@Manifest retweeted
Jun 8
now on the eve of RSI it seems everyone is more mutual conditional pause agreement pilled than they used to be and that seems like a good development
Bayesian @LessOnline&@Manifest retweeted
Understanding all the causes of increased disclosures is complicated. But we observe a sharp uptick in High and Critical CVEs around the time of Anthropic’s release of Mythos Preview to Project Glasswing partners in late March. OpenAI’s Daybreak cybersecurity program also launched in May.
It’s wild to me that the labs keep increasing usage limits for their subscription products without a corresponding decrease in api token costs. It must pay a *lot* in enterprise revenue to subsidize subscriptions for individuals.
We've doubled usage limits in Claude Cowork for the next month. Delegate bigger, more complex tasks to Claude.
This is absolutely hilarious because it will happen in 2-3 months
Replying to @Research_FRI
🧑‍💻 Experts predict that AI will complete eight-hour tasks with 80% success by 2030
@METR_Evals' 80% task-completion time horizon benchmark measures the duration over which an AI agent is predicted to succeed at a task 80% of the time. We asked panelists to forecast when an AI model will achieve an 80% success rate on software tasks requiring 8 hours or more of human effort.
Median forecasts:
Experts: 2030
Superforecasters: 2028
General public: 2037
Bayesian @LessOnline&@Manifest retweeted
The AI boom has doubled computing infrastructure's share of US GDP. Investment in AI-related data center construction, compute hardware, and networking equipment accounted for ~0.8% of US GDP in Q1 2026, driving computing infrastructure as a whole to ~1.5% of GDP.
Bayesian @LessOnline&@Manifest retweeted
I think it's underappreciated how economically valuable AI safety is. A model that frequently goes off the rails, takes dangerous actions, is misleading or deceptive, etc. is simply much less valuable than a model that does not do that.
We are seeking research engineers who will build evals that test for misaligned model behavior.
Bayesian @LessOnline&@Manifest retweeted
May 31
This benchmark is great! A model that I like scores highly and a model that I dislike scores poorly. This benchmark is slop! A model that I dislike is at the top of the rankings. How can that be possible? I have taste!
remarkable honesty
what did @tszzl see
Bayesian @LessOnline&@Manifest retweeted
Interesting how different are frontier labs' notions of progress towards AGI.
OpenAI: "we've disproven an old conjecture in math"
Anthropic: "we've discovered ALL the vulnerabilities"
DeepSeek: "we've made context free"
Google DeepMind: "we've reduced the batch size for Flash"
Bayesian @LessOnline&@Manifest retweeted
At some point in the next few months people are going to have to come around to the idea that we weren’t hyping it up for investors.
Bayesian @LessOnline&@Manifest retweeted
Called it
In the next few days we'll be ramping up Claude inference on Colossus. Grateful to be partnering with SpaceX here. We are going to need to move a lot of atoms in order to keep up with AI demand, and there's nobody better at quickly moving atoms (on or off planet Earth)
Saw this one coming last month!
Correction: had forgotten about this but the origin for the idea was @HenriLemoine13 and i was skeptical
Bayesian @LessOnline&@Manifest retweeted
Replying to @NotTomBrown
Same here. By way of background for those who care, I spent a lot of time last week with senior members of the Anthropic team to understand what they do to ensure Claude is good for humanity and was impressed. Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector. So long as they engage in critical self-examination, Claude will probably be good. After that, I was ok leasing Colossus 1 to Anthropic, as SpaceXAI had already moved training to Colossus 2.
Reasoning: they have frontier levels of compute, they scale their datacenters very fast, but they have comparatively little customer demand and comparatively low margins, so it makes sense to sell some of their ~ unused compute
a 93% win rate, that is kind of nuts
Apr 21
Exciting news - GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards! A clean sweep with a record-breaking 242 point lead in Text-to-Image - the largest gap we’ve seen to date.
- #1 Text-to-Image (1512), 242 over #2 (Nano-banana-2 with web-search aka gemini-3.1-flash-image)
- #1 Single-Image Edit (1513), 125 over #2 (Nano-banana-pro aka gemini-3-pro-image)
- #1 Multi-Image Edit (1464), 90 over #2 (Nano-banana-2)
No model has dominated Image Arena with margins this wide. Huge congratulations to @OpenAI on this major breakthrough in image generation! More performance breakdowns by category in the thread below.
and 96% against nano banana
so close!
NEWS: xAI plans to supply tens of thousands of GPUs to coding startup Cursor to train its upcoming Composer 2.5 AI model, marking a strategic shift toward providing cloud computing services to third-party developers. The arrangement, according to Business Insider, allows Cursor to leverage xAI's massive infrastructure to develop advanced coding capabilities while providing xAI with a new revenue stream to offset data center costs. businessinsider.com/elon-mus…