Ben Taleb Jr.

Tweets

Ben Taleb Jr. @macintoch

Jun 13

If Dario had shut his mouth and kept people working, this wouldn’t have happened. Also, if the company under his stupid leadership hadn’t built the reputation of Mythos on security threats, this also wouldn’t have happened. Board: remove Dario now!

Pietro Schirano

@skirano

Jun 13

This is exactly what Dario wants, by the way.

Ben Taleb Jr. @macintoch

Jun 13

Immediate reaction to Anthropic, and I expect anyone affected to do the same! In 2-3 weeks we will have a better model from OpenAI, or in 2-3 months we will have a Fable-level open-source model from China. Trump is fucking the USA. China wins again; they should make a statue for Trump agent.

Ben Taleb Jr. @macintoch

Jun 10

Claude Erdos is outperforming GPT-5.6. I am sure OpenAI will reply tomorrow! That's a real dethroning model in terms of performance and cost. OpenAI: your move, and we know you will nail it, just be quick! ThursdAi is back 🤩🥳

Ben Taleb Jr. retweeted

Dan McAteer

@daniel_mac8

Jun 9

Claude Fable 5 will launch today. Where do we think it lands on FrontierCode? We already know Mythos Preview is at 93.9% on SWE-Bench verified and 77.8% on SWE-Bench Pro. A 5.3% gain and 8.6% gain over Opus 4.8, respectively. Opus 4.8 is already *twice* as good as GPT-5.5. If Claude Fable is even better than that, which it almost certainly will be, it puts a lot of pressure on GPT-5.6 for competing in the developer market. Arguably the most important market for AI, at least for now.

Dan McAteer

@daniel_mac8

Jun 9

Opus 4.8 more than doubles GPT-5.5's performance on FrontierCode, the new coding benchmark from @cognition. It consists of ultra-hard coding tasks that would take an experienced human engineer more than a full 40-hour workweek to complete. That, plus Mythos is coming today. GPT-5.6 better be good.

Ben Taleb Jr. @macintoch

Jun 8

Codex and ChatGPT Pro are cowards; Claude is daring but also an impostor and a liar. I am getting great stuff from both, knowing both personalities.

Ben Taleb Jr. @macintoch

Jun 8

I am really shocked how LLMs feel so intelligent and smarter than humans at stuff they are well trained on, and at the same time feel so dumb at basic stuff they don't know. As a dumb human, I dare to ask them stuff they don't know, but they fail. Still 2d

Ben Taleb Jr. @macintoch

Jun 8

I did great stuff with Opus 4.6, and it was fast and responsive in Claude Code. Now with Opus 4.8 it feels more like Codex in terms of speed but like Sonnet in terms of performance. The only thing I really gave Claude credit for over Codex is the one on one

Ben Taleb Jr. retweeted

Haider.

@haider1

Jun 6

i'm genuinely worried about Mythos release because i've had early access, and in my limited testing, it can:
- automate an entire software company
- explain how to take over the world
- reason about existential risk for 16 hours
- design a new programming language
- threaten the entire labor market

Ben Taleb Jr. @macintoch

Jun 5

I can't believe Claude Code on Opus 4.8 is more token-efficient than Codex on GPT-5.5 xhigh.

Ben Taleb Jr. @macintoch

Jun 5

What's going on with Codex remote in ChatGPT? It can't log in, whatever I do. I updated the Codex app and reinstalled ChatGPT on iOS, and it's still the same issue. @Codex_Changelog @OpenAIDevs

Ben Taleb Jr. @macintoch

Jun 4

The ChatGPT app and Codex feel very laggy and slow today. Is it only me, or is something cooking for todAi from OpenAI?

Ben Taleb Jr. @macintoch

Jun 3

That's the 4th conversation with its own goal in Codex. Around 350 Linear issues, and it keeps going. Every issue is reviewed by the Pro model.

Ben Taleb Jr. @macintoch

May 30

I renewed my Claude Max20 yesterday, and I can tell Opus 4.8 is way superior to GPT-5.5 xhigh in UI design. It is also more token-efficient. Honest opinion.

Ben Taleb Jr. @macintoch

May 28

Anthropic models pretend to do things more than they really do them. It looks like this has affected the whole company's management pitch, to the point that they started to behave just like their models: promising, teasing, hyping, and no real shit delivered.

Ben Taleb Jr. @macintoch

May 28

The first time in my AI life since 2023 that I don't get excited for a new Anthropic model.
- Is it saturation?
- Did I lose trust in Anthropic?
- Is it the GPT-5.5 effect?
All of them?

Ben Taleb Jr. @macintoch

May 28

Maybe OpenAI will change its mind and drop GPT-5.6 today, just to slap Anthropic in the face and steal the show!

Ben Taleb Jr. @macintoch

May 28

Paused coding for 3 days and found myself on X commenting on tweets. Maybe filling that prompting hunger?

Ben Taleb Jr. retweeted

David Sacks

@DavidSacks

May 28

Back in January, on our Predictions show, my Most Contrarian Take for 2026 was that AI would create more white-collar jobs, rather than destroying them. This week Goldman Sachs’ CEO, Sam and even Dario seemed to agree. The consensus is shifting.

Ben Taleb Jr. @macintoch

May 27

If Jevons Paradox were a stock, I would go all in.

AshutoshShrivastava

@ai_for_success

May 26

Omni Flash is crazy. Incredible stuff from Google DeepMind. I just wish I had unlimited credits or access 😂

Ben Taleb Jr. @macintoch

May 27

When you start thinking of Codex as more than a coding agent, you start understanding AI.

Dan McAteer

@daniel_mac8

May 27

The underrated thing about Codex:
> It lets you bring your work into the agent, not just bring the agent into your work.
It is subtle, but also profound. When your browser, apps, files, notes, terminal, screenshots, and repo context all live in the same Codex workspace, your agent doesn't just do one-off tasks anymore. Your agent is building a context to develop within. It makes Codex less of a task-executor and more of a colleague or thinking partner.
The question becomes: not “what can my AI agent do for me?” but: “what can my AI agent do with me?”
Example: open a research paper in the Codex in-app browser and do research with Codex at your side while you read it.