A research paper inspired by a Beatles song title and a scientist who never lost faith in Essential AI. This is the story of "Attention Is All You Need" and the billion-dollar startup it helped create.
@essential_ai
Nearly a decade ago, a group of eight Google researchers published “Attention Is All You Need.” At the time, it looked like another important machine learning paper for natural language processing, not necessarily the one that would end up reshaping the entire AI field. Its core idea, the Transformer, was technical, but the impact became very practical. The same architecture later became the foundation for systems such as ChatGPT, Gemini, and Claude. Years later, the paper has surpassed 200,000 citations, underscoring its deep influence on modern AI research.
The Transformer is the 'T' in GPT and helped drive huge gains for companies like NVIDIA, Google, and Microsoft. When Ashish Vaswani, one of the paper's authors, presented his research, the audience responded with enthusiastic acclaim.
More than 2,200 people packed the auditorium, while 10,000 more watched online. Lines snaked around the second floor of the San Jose Convention Center. For a tech conference, it was the closest thing to Beatlemania—fitting, given the paper's inspiration.
All eight authors have since left Google. Between them, they've founded seven companies, several of which are now valued at over $1 billion. The transformer paper became their golden ticket, but Vaswani always had a unique vision for AI.
Vaswani was not convinced that simply scaling Transformers would be enough. Larger language models had already shown real performance gains, but he seemed to believe the field would eventually need more than size alone. That view was part of what brought him and Niki Parmar together to launch Essential AI in 2023.
He also changed course during fundraising. At one point, Vaswani told investors that Essential AI would move away from enterprise tools and focus more on open, foundational research. It was not the original pitch, but March Capital stayed with him.
Vaswani isn’t losing sleep over AGI taking over the world. What troubles him is how the rush for AI breakthroughs could be squeezing the life out of real science. Right now, a few labs keep pouring money into the same old transformer playbook, making it tough for independent thinkers to get a foot in the door. "A handful of companies control the production, pace, and flow of advanced AI," he wrote in Essential’s manifesto.
The reaction to GPT-5 seemed to reinforce that concern. Many people saw it as an improvement, but not a major break from the current path. To Vaswani, that raised the same question again: how far can the field go by making models larger and training them with more compute? His view is that the next major step in AI may have to come from a different kind of idea, not just another round of scaling.
Early results from Essential show that their models can start self-correcting much earlier in training, without the expensive fine-tuning usually needed afterward. If this holds up, building advanced AI could get much cheaper, making it possible for smaller teams to compete with industry leaders.
By August 2025, Essential AI had raised a $175 million Series B round from Lightspeed Venture Partners and Thrive Capital. The round reportedly valued the company at about $1 billion. For Vaswani, the milestone carried extra weight because of his role in the original Transformer paper. It also showed that investors were willing to bet on his view that AI still has room for a more fundamental breakthrough.
There is a certain irony: The man whose paper ignited the trillion-dollar scaling arms race is now the loudest voice questioning whether "attention is all you need" anymore. "If you took a quieter approach to work, I think it might lead to healthier attitudes," he says now.
Vaswani refuses to let ChatGPT write for him. His San Francisco lab features portraits of Alan Turing, Ada Lovelace, Claude Shannon, and Jagadish Chandra Bose. The next transformer breakthrough won't announce itself—it will come from someone who stops scaling and starts thinking.
Vaswani is betting that Essential AI is the place where that happens.
geni.us/Attention-All-You-Ne…