Bayesian @LessOnline&@Manifest

Tweets

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Jun 8

"I feel confident that even [Anthropic] will do the right thing"

AI Insights

@plzaccelerate

Jun 8

Sam Altman on Anthropic: "They've built a company on hating us or something like that. I think we all care about not destroying the world with AI." — #sama

Bayesian @LessOnline&@Manifest retweeted

roon

@tszzl

Jun 8

now on the eve of RSI it seems everyone is more mutual conditional pause agreement pilled than they used to be and that seems like a good development

Bayesian @LessOnline&@Manifest retweeted

Epoch AI

@EpochAIResearch

Jun 5

Understanding all the causes of increased disclosures is complicated. But we observe a sharp uptick in High and Critical CVEs around the time of Anthropic’s release of Mythos Preview to Project Glasswing partners in late March. OpenAI’s Daybreak cybersecurity program also launched in May.

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Jun 5

It’s wild to me that the labs keep increasing usage limits for their subscription products without a corresponding decrease in api token costs. It must pay a *lot* in enterprise revenue to subsidize subscriptions for individuals.

Claude

@claudeai

Jun 5

We've doubled usage limits in Claude Cowork for the next month. Delegate bigger, more complex tasks to Claude.

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Jun 5

This is absolutely hilarious because it will happen in 2-3 months

Forecasting Research Institute

@Research_FRI

Jun 2

Replying to @Research_FRI

🧑‍💻 Experts predict that AI will complete eight-hour tasks with 80% success by 2030

@METR_Evals' 80% task-completion time horizon benchmark measures the duration over which an AI agent is predicted to succeed at a task 80% of the time. We asked panelists to forecast when an AI model will achieve an 80% success rate on software tasks requiring 8 hours or more of human effort.

Median forecasts:
Experts: 2030
Superforecasters: 2028
General public: 2037

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Jun 5

for the degens manifold.markets/Bayesian/be…

Best METR 80% Time Horizon before October 2026

See https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/ Resolves to the longest 80% Time Horizon, as measured by METR, for any AI system, by the end of September 2026....

manifold.markets

Bayesian @LessOnline&@Manifest retweeted

Epoch AI

@EpochAIResearch

Jun 5

The AI boom has doubled computing infrastructure's share of US GDP. Investment in AI-related data center construction, compute hardware, and networking equipment accounted for ~0.8% of US GDP in Q1 2026, driving computing infrastructure as a whole to ~1.5% of GDP.

Bayesian @LessOnline&@Manifest retweeted

Tamay Besiroglu

@tamaybes

Jun 4

I think it's underappreciated how economically valuable AI safety is. A model that frequently goes off the rails, takes dangerous actions, is misleading or deceptive, etc. is simply much less valuable than a model that does not do that.

Mechanize

@MechanizeWork

May 28

We are seeking research engineers who will build evals that test for misaligned model behavior.

wh

Bayesian @LessOnline&@Manifest retweeted

@nrehiew_

May 31

This benchmark is great! A model that I like scores highly and a model that I dislike scores poorly. This benchmark is slop! A model that I dislike is at the top of the rankings. How can that be possible? I have taste!

Bayesian @LessOnline&@Manifest

@Bayesian0_0

May 27

remarkable honesty

Alex Volkov

@altryne

May 26

what did @tszzl see

Bayesian @LessOnline&@Manifest retweeted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

@teortaxesTex

May 22

Interesting how different are frontier labs' notions of progress towards AGI.
OpenAI: "we've disproven an old conjecture in math"
Anthropic: "we've discovered ALL the vulnerabilities"
DeepSeek: "we've made context free"
Google DeepMind: "we've reduced the batch size for Flash"

Bayesian @LessOnline&@Manifest retweeted

julia @mooncat_is

May 9

At some point in the next few months people are going to have to come around to the idea that we weren’t hyping it up for investors.

Bayesian @LessOnline&@Manifest retweeted

Henri Lemoine @HenriLemoine13

May 6

Called it

Tom Brown

@NotTomBrown

May 6

In the next few days we'll be ramping up Claude inference on Colossus. Grateful to be partnering with SpaceX here. We are going to need to move a lot of atoms in order to keep up with AI demand, and there's nobody better at quickly moving atoms (on or off planet Earth)

Bayesian @LessOnline&@Manifest

@Bayesian0_0

May 6

Saw this one coming last month!

Bayesian @LessOnline&@Manifest

@Bayesian0_0

May 6

Correction: had forgotten about this but the origin for the idea was @HenriLemoine13 and i was skeptical

Bayesian @LessOnline&@Manifest retweeted

Elon Musk

@elonmusk

May 6

Replying to @NotTomBrown

Same here. By way of background for those who care, I spent a lot of time last week with senior members of the Anthropic team to understand what they do to ensure Claude is good for humanity and was impressed. Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector. So long as they engage in critical self-examination, Claude will probably be good. After that, I was ok leasing Colossus 1 to Anthropic, as SpaceXAI had already moved training to Colossus 2.

Bayesian @LessOnline&@Manifest

@Bayesian0_0

May 6

Reasoning: they have frontier levels of compute, they scale their datacenters very fast, but they have comparatively little customer demand and comparatively low margins, so it makes sense to sell some of their ~ unused compute

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Apr 21

a 93% win rate, that is kind of nuts

Arena.ai

@arena

Apr 21

Exciting news - GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards! A clean sweep with a record-breaking 242 point lead in Text-to-Image - the largest gap we’ve seen to date.
- #1 Text-to-Image (1512), 242 over #2 (Nano-banana-2 with web-search aka gemini-3.1-flash-image)
- #1 Single-Image Edit (1513), 125 over #2 (Nano-banana-pro aka gemini-3-pro-image)
- #1 Multi-Image Edit (1464), 90 over #2 (Nano-banana-2)
No model has dominated Image Arena with margins this wide. Huge congratulations to @OpenAI on this major breakthrough in image generation! More performance breakdowns by category in the thread below.

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Apr 21

and 96% against nano banana

Bayesian @LessOnline&@Manifest

@Bayesian0_0

Apr 16

so close!

X Daily News

@xDaily

Apr 16

NEWS: xAI plans to supply tens of thousands of GPUs to coding startup Cursor to train its upcoming Composer 2.5 AI model, marking a strategic shift toward providing cloud computing services to third-party developers. The arrangement, according to Business Insider, allows Cursor to leverage xAI's massive infrastructure to develop advanced coding capabilities while providing xAI with a new revenue stream to offset data center costs. businessinsider.com/elon-mus…
