Benjamin Polge

@BenjaminPolge

Was at @OpenAI Paris event today. The real story isn’t the Codex roadmap, it’s that OpenAI is already teasing its next frontier model, landing “in a few weeks.” 🧵

Benjamin Polge

@BenjaminPolge

On product: the throughline is Codex absorbing more of ChatGPT’s territory. OpenAI’s own framing is Codex functionality flowing into ChatGPT, not the two apps fully merging.

Benjamin Polge

@BenjaminPolge

x.com/openaidevs/status/2066…

OpenAI Developers

@OpenAIDevs

More of Codex is rolling out across Europe this week. We’re bringing Computer use, the Codex Chrome extension, personalized memory, and Chronicle to Codex users in the EEA, UK, and Switzerland. developers.openai.com/codex/…

Benjamin Polge

@BenjaminPolge

Replying to @steipete

My read: a Codex-only surface likely survives for agent orchestration, since the standalone app is built around managing multiple agents over time. That’s my inference from the framing, not something confirmed on stage.

Benjamin Polge

@BenjaminPolge

Net read: VivaTech is this week’s visible event, but OpenAI’s real signal is already pointed at the next model.

Benjamin Polge retweeted

OpenAI Developers

@OpenAIDevs

Changelog – Codex | OpenAI Developers

Latest updates to Codex, OpenAI’s coding agent

developers.openai.com

Benjamin Polge retweeted

Cointelegraph

@Cointelegraph

13h

🤖 TECH: Researchers at ETH Zurich have built a quadruped robot that can track, predict, and return badminton shuttlecocks against human opponents.

Benjamin Polge retweeted

Wes Roth

@WesRoth

23h

Google Research introduced Gemini-SQL2, a new text-to-SQL system powered by Gemini 3.1 Pro. It converts questions written in everyday language into executable SQL queries, allowing users to retrieve information from databases without manually writing the code. Gemini-SQL2 achieved state-of-the-art performance on the BIRD benchmark, reportedly reaching 80.04% execution accuracy. Unlike evaluations that only check whether generated SQL resembles a reference answer, BIRD runs the query and verifies whether it returns the correct result. The technology could eventually strengthen natural-language access across Google’s data and analytics services.

Google Research

@GoogleResearch

Jun 12

🚀 Introducing Gemini-SQL2, our breakthrough text-to-SQL capability powered by Gemini 3.1 Pro! We've achieved state-of-the-art results on the highly competitive BIRD benchmark, translating natural language into execution-ready SQL queries. 🧵👇
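
The execution-based check described above (run the generated query and compare what it returns, rather than comparing SQL text) can be sketched in a few lines. This is a minimal illustration, not BIRD's actual evaluation script: the `players` table, the query pairs, and the helper names `execution_match` and `execution_accuracy` are all hypothetical, and treating results as order-insensitive is an assumption made here for simplicity.

```python
import sqlite3


def execution_match(conn, predicted_sql, reference_sql):
    """Return True when both queries return the same rows (order ignored)."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to run is scored as incorrect.
        return False
    reference = conn.execute(reference_sql).fetchall()
    # Assumption: row order does not matter, so compare sorted row lists.
    return sorted(predicted, key=repr) == sorted(reference, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, reference) pairs whose results match."""
    hits = sum(execution_match(conn, pred, ref) for pred, ref in pairs)
    return hits / len(pairs)


# Hypothetical toy database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE players(name TEXT, wins INTEGER);
    INSERT INTO players VALUES ('Ana', 3), ('Bo', 5), ('Cy', 1);
    """
)

pairs = [
    # Different SQL text, same rows: counted as correct.
    ("SELECT name FROM players WHERE wins > 2",
     "SELECT name FROM players WHERE wins >= 3"),
    # Different strategy, same single row: counted as correct.
    ("SELECT name FROM players ORDER BY wins DESC LIMIT 1",
     "SELECT name FROM players WHERE wins = 5"),
    # Similar-looking SQL, different rows: counted as wrong.
    ("SELECT name FROM players WHERE wins > 3",
     "SELECT name FROM players WHERE wins >= 3"),
    # Misspelled column, query fails to execute: counted as wrong.
    ("SELECT nme FROM players",
     "SELECT name FROM players"),
]

print(execution_accuracy(conn, pairs))
```

The third pair shows why this kind of metric is stricter than text similarity: the two queries differ by a single character yet return different rows, so the prediction is scored as wrong.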

Benjamin Polge retweeted

Wes Roth

@WesRoth

14h

ByteDance released Seedance 2.0 Mini, a faster and lower-cost version of its Seedance video-generation model. The model can create videos using image, video, and audio references, making it suitable for everyday production workflows. Seedance 2.0 Mini is priced at roughly 50% of Seedance 2.0’s listed cost, offering a more accessible option for creators and teams producing videos at higher volume.

BytePlus

@BytePlusGlobal

Jun 15

ByteDance's Seedance 2.0 Mini is live. A faster, more accessible way to create videos from image, video, and audio references. Built for everyday production workflows at about 50% of Seedance 2.0's list price.

Benjamin Polge retweeted

Unsloth AI

@UnslothAI

Jun 15

You can now run Kimi K2.7 Code locally! 🌘 We shrank the 1T model to 325GB (-48%) via Dynamic 2-bit where important layers are upcasted. Run at >40 tok/s on 330GB RAM/VRAM setups. Run full precision on 610 GB. Guide: unsloth.ai/docs/models/kimi-… GGUF: huggingface.co/unsloth/Kimi-…

Kimi.ai

@Kimi_Moonshot

Jun 12

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: 21.8% on Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai

Benjamin Polge retweeted

Aron Prins

@aronprins

23h

Breaking News: Claude is pausing the Agent SDK credit change!

Benjamin Polge retweeted

Alex Atallah

@alexatallah

Jun 13

We just announced our Fusion API:
- Fable-level performance on deep research tasks, at half the cost
- Better-than-SOTA performance using panels
The future of AI is neurodiversity, not single-model takeovers.

OpenRouter

@OpenRouter

Jun 13

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

Benjamin Polge retweeted

OpenRouter

@OpenRouter

Jun 13

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

Benjamin Polge retweeted

Robin Ebers · AI for Business Owners

@robinebers

Jun 13

BREAKING: ANTHROPIC IS OFFERING REFUNDS UNTIL JUNE 20TH

Benjamin Polge retweeted

OpenAI

@OpenAI

Jun 12

We heard you wanted to use Codex rate limit resets on your own time. Starting today, we’re rolling out the ability to save rate limit resets to use later. We’re starting Go, Plus, Pro, and Business users with one free reset:

Benjamin Polge retweeted

ClaudeDevs

@ClaudeDevs

Jun 10

New for Apple developers: Foundation Models support for Claude lets developers use Apple's Foundation Models framework to call Claude for multi-step reasoning, code generation, and longer context.

Benjamin Polge retweeted

SemiAnalysis

@SemiAnalysis_

Jun 10

Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)

Benjamin Polge retweeted

Dario Amodei

@DarioAmodei

Jun 10

Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: darioamodei.com/post/policy-…

Dario Amodei — Policy on the AI Exponential

darioamodei.com
