AI-pilled | AI Analyst

Joined February 2015
Was at @OpenAI Paris event today. The real story isn’t the Codex roadmap, it’s that OpenAI is already teasing its next frontier model, landing “in a few weeks.” 🧵
On product: the throughline is Codex absorbing more of ChatGPT’s territory. OpenAI’s own framing is Codex functionality flowing into ChatGPT, not the two apps fully merging.

More of Codex is rolling out across Europe this week. We’re bringing Computer use, the Codex Chrome extension, personalized memory, and Chronicle to Codex users in the EEA, UK, and Switzerland. developers.openai.com/codex/…
Replying to @steipete
My read: a Codex-only surface likely survives for agent orchestration, since the standalone app is built around managing multiple agents over time. That’s my inference from the framing, not something confirmed on stage.
Net read: VivaTech is this week’s visible event, but OpenAI’s real signal is already pointed at the next model.
Benjamin Polge retweeted
🤖 TECH: Researchers at ETH Zurich have built a quadruped robot that can track, predict, and return badminton shuttlecocks against human opponents.

Benjamin Polge retweeted
Google Research introduced Gemini-SQL2, a new text-to-SQL system powered by Gemini 3.1 Pro. It converts questions written in everyday language into executable SQL queries, allowing users to retrieve information from databases without manually writing the code. Gemini-SQL2 achieved state-of-the-art performance on the BIRD benchmark, reportedly reaching 80.04% execution accuracy. Unlike evaluations that only check whether generated SQL resembles a reference answer, BIRD runs the query and verifies whether it returns the correct result. The technology could eventually strengthen natural-language access across Google’s data and analytics services.
🚀 Introducing Gemini-SQL2, our breakthrough text-to-SQL capability powered by Gemini 3.1 Pro! We've achieved state-of-the-art results on the highly competitive BIRD benchmark, translating natural language into execution-ready SQL queries. 🧵👇
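To make the execution-accuracy idea above concrete, here is a minimal sketch of scoring text-to-SQL output by running it rather than comparing query text. It is not the official BIRD evaluator: the `orders` table, the sample queries, and the helper names `execution_match` and `execution_accuracy` are all hypothetical, and comparing results as sets is a simplifying assumption.

```python
import sqlite3


def execution_match(conn, predicted_sql, reference_sql):
    """Return True when both queries return the same set of rows."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A query that fails to run is scored as incorrect.
        return False
    reference = conn.execute(reference_sql).fetchall()
    # Assumption: row order and duplicates are ignored (set comparison).
    return set(predicted) == set(reference)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, reference) pairs whose results match."""
    matches = sum(execution_match(conn, p, r) for p, r in pairs)
    return matches / len(pairs)


# Hypothetical toy database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders VALUES (1, 'ada', 40.0), (2, 'bob', 15.0), (3, 'ada', 25.0);
    """
)

reference = "SELECT customer FROM orders GROUP BY customer HAVING SUM(total) > 50"
# Written differently from the reference, but returns the same rows.
textually_different = (
    "SELECT o.customer FROM orders AS o "
    "GROUP BY o.customer HAVING SUM(o.total) > 50.0"
)
# Looks similar to the reference, but filters single orders instead of sums.
wrong = "SELECT customer FROM orders WHERE total > 50"

pairs = [
    (textually_different, reference),
    (wrong, reference),
    ("SELEC customer FROM orders", reference),  # invalid SQL
]
print(execution_accuracy(conn, pairs))
```

Here the first prediction counts as correct even though its text differs from the reference, while the second, which resembles the reference, is scored as wrong because it returns different rows.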
Benjamin Polge retweeted
ByteDance released Seedance 2.0 Mini, a faster and lower-cost version of its Seedance video-generation model. The model can create videos using image, video, and audio references, making it suitable for everyday production workflows. Seedance 2.0 Mini is priced at roughly 50% of Seedance 2.0’s listed cost, offering a more accessible option for creators and teams producing videos at higher volume.
ByteDance's Seedance 2.0 Mini is live. A faster, more accessible way to create videos from image, video, and audio references. Built for everyday production workflows at about 50% of Seedance 2.0's list price.
Benjamin Polge retweeted
You can now run Kimi K2.7 Code locally! 🌘 We shrank the 1T model to 325GB (-48%) via Dynamic 2-bit where important layers are upcasted. Run at >40 tok/s on 330GB RAM/VRAM setups. Run full precision on 610 GB. Guide: unsloth.ai/docs/models/kimi-… GGUF: huggingface.co/unsloth/Kimi-…
🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: 21.8% on Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
Benjamin Polge retweeted
Breaking News: Claude is pausing the Agent SDK credit change!
Benjamin Polge retweeted
We just announced our Fusion API:
- Fable-level performance on deep research tasks, at half the cost
- Better-than-SOTA performance using panels
The future of AI is neurodiversity, not single-model takeovers.
Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇
Benjamin Polge retweeted
BREAKING: ANTHROPIC IS OFFERING REFUNDS UNTIL JUNE 20TH
Benjamin Polge retweeted
Jun 12
We heard you wanted to use Codex rate limit resets on your own time. Starting today, we’re rolling out the ability to save rate limit resets to use later. We’re starting Go, Plus, Pro, and Business users with one free reset:
Benjamin Polge retweeted
New for Apple developers: Foundation Models support for Claude lets developers use Apple's Foundation Models framework to call Claude for multi-step reasoning, code generation, and longer context.
Benjamin Polge retweeted
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)
Benjamin Polge retweeted
Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: darioamodei.com/post/policy-…