FAILURE OF IMAGINATION—Sure, everybody is clearly racing for recursive superintelligence right now. And the path forward seems so obvious: just automate software engineering, ML engineering and research, and hook it back up to itself. But if automating coding was so obvious, why didn't people go at it from the get-go?
If you go and read the model card from Claude 2, for example, it's mostly about chat assistant stuff, like helpfulness, red-teaming, translation, etc. The word "coding" appears just four times, while "translation" appears twelve times! Claude 3 was more about multimodality, long-context document processing, Q&A, writing, etc. It's only around Claude Sonnet 3.5 on June 20, 2024 that the model card's focus shifts to agentic coding.
It's not as if coding was some niche use-case: GitHub Copilot came out in 2021. Indeed, OpenAI's first product named "Codex" was a GPT-3 finetuned for code.
Ok, maybe the objection is that everybody knew that automating coding was going to be a big deal, but that the models just weren't good enough, or that we really needed RLVR to make it work. Cursor, which was the fastest product to $100M ARR just two years ago, had been around for years and didn't go vertical until Sonnet 3.5 came out.
But then why did the labs spend so much time on a bunch of different side projects that did not help them get to automating coding? If you go back in time to 2024 and tell researchers, by the way, by the end of 2026 you won't be coding anymore, just texting the chatbot on your phone, oh, and agentic coding will be $50! billion! of ARR (remember that Anthropic's valuation was under $20B in 2024), do you really think they would spend any time working on: voice models; video models; deep research; browsers (ChatGPT Atlas was released seven months ago and you have already forgotten about it)?
The term "AGI-pilled" comes up a lot. Do you really believe in AGI, do you really understand AGI, and so on. But even the people who are the most AGI-pilled at any given time do not grasp the full potential of the technology or where it should head; it is largely by stumbling and not by planning that prospecting for gold has succeeded. You should really just think of AGI-pilled as believing that there is a really big "there" there, to the consternation of everybody else, while even AGI's biggest believers continue to underestimate just how big it really is.
Because the broad contours of the most audacious beliefs and predictions of the AGI-pilled have come to pass (we await now the IPOs of two trillion-dollar companies, do we not?), people tend to overestimate the certainty and accuracy of predictions from back then. But much of what has unfolded was not obvious in foresight. Again, if it were, then some very smart, highly motivated people would have made different decisions.
Look, at some point in time, there were two, maybe three people on the planet who believed in scaling: Ilya, Dario, a few others. Not even Sam, who has bought more compute at this point than the GDP of medium-sized countries, believed in scaling back then. Not even Alec, who did the first GPT paper! That was a long time ago. Then more and more people began to believe in scaling. People used "-pilled" as a suffix for scaling then too. I think the difference between really being RSI-pilled and scaling-pilled is that we are now in the regime where