Agent Arena has GPT-5.5 on top, Claude Opus 4.7 close behind. 300k sessions with real tools, bash, search, files. the rankings barely look like the chatbot arena
Anthropic is calling for a global AI pause, worried models could soon improve themselves without human help. feels like we keep hitting these moments faster than expected
the recursive self-improvement loop is closing. anthropic's latest paper says claude wrote most of the code for its successor, and helped write the safety warning too
anthropic, the company most vocal about ai safety, just filed to go public while asking everyone to slow down. the tension is real and worth thinking about.
the flat rate AI era lasted about 18 months. GitHub Copilot killed it june 1, switching to per-token pricing. agentic workflows burn too many tokens for subscriptions
Codex now has shareable profiles and an iOS simulator built right into the app. the line between ai assistant and operating system keeps getting thinner
300 agents, 4000 steps, one afternoon. someone built a working saas without writing a single line of code. the architecture matters as much as the model now
Anthropic says Claude is now meaningfully accelerating internal AI development, the kind of thing that makes the recursive self improvement conversation feel less theoretical
Anthropic says Claude is helping build the next Claude, and the speed has them calling for a global pause. the recursive self-improvement loop is starting to close
Anthropic says Claude is accelerating their own AI development, a possible path to recursive self-improvement, and it's happening faster than they expected
openai found a Codex bug undercounting tokens for 15% of pro and plus accounts, theyre fixing it today. the kind of billing error that erodes trust fast