Building enterprise super intelligence

Joined November 2024
Pinned Tweet
The context problem in AI is much deeper than it looks. Feeding an LLM more documents doesn't solve it. Context isn't just information. It's understanding built from experience over time. The only real fix is continual learning: AI that keeps updating from the environment it operates in, rather than freezing at training time. At Skyfall, this has shaped our research roadmap from the start. First, we explored world models as planners (worth reading Dr. Fei-Fei Li's recent breakdown on this). Our work on SCOPE showed that a small, specialized world model could outperform frontier LLMs on sequential decision-making by being 55x faster than GPT-3.5, 160,000x smaller than GPT-4o, and more accurate on planning tasks. Now we're focused on world models as simulators. A simulator isn't just a rendering of the world. It's a physically and dynamically faithful environment that agents can actually train in. And for a simulator to stay useful over time, it needs continual learning at its core. Agents that train in a static simulation will eventually hit a ceiling. Agents that train in a world that evolves with them won't. Context is a learning problem. That's what we're building toward, and we're launching something soon. Stay tuned. ✨
We're proud to welcome Arjun Mohan (@arjunmhn) as Chief of Staff at Skyfall AI, based in our San Francisco office! Arjun brings deep operational and financial expertise from four years at Barclays in private markets and private wealth management, where he built financial systems, raised capital, and advised family offices on their private market investments. He's a @Wharton School MBA (Mayer Fellow) with an Economics degree from @uniofwarwick. During his time at Wharton, he interned at Superscript, a Series A healthcare startup, where he worked across product deployment, fundraising, and strategic finance. Welcome aboard, Arjun!
Do you enjoy being a small frog in a big pond? 🐸 Do you love your life doing RL fine-tuning and hyperparameter optimization? 🚨 Hiring Alert Torontonians! We're hiring a Research Engineer who can help us scale world models and not just be a small frog in a big pond at OpenAI or Anthropic. Scaling World Models in 2026 is like Scaling LLMs in 2019: you are a pioneer and trendsetter. Do you want to become the next Alec Radford or do you want to twiddle your thumbs at OpenAI and Anthropic? Skyfall AI is on a mission to build the future of autonomous enterprises, and we're looking for a Research Engineer (ML) to join our cutting-edge AI research team. You’ll play a key role in developing AI training infrastructure, pushing the boundaries of World Models, LLMs and RL, and contributing to the broader research community through publications and open-source projects. If you have a Master's in Computer Science or a related field, real industry experience in LLMs, and the drive and passion to grind it out in a startup to build the next big thing, then feel free to apply. JD and link to apply below:
Skyfall AI retweeted
Language models gave AI the ability to talk about the world. World models will give AI the ability to understand it. But “world model” is an overloaded term. What does it mean? HAI Founding Director @drfeifei offers the taxonomy that matters now. brnw.ch/21x36eY

LLMs taught us what AI could do with language, but language isn't enough. The next layer is world models. Systems that don't just predict text but understand space, causality, and consequence. Systems that know what happens when something changes, not just how to describe it. ➡️That's what we're building at Skyfall.
For those who want to go deeper, here's our research so far:
- LLMs playing OpenRCT2: skyfall.ai/blog/claude-gpt-a…
- LLMs and Enterprise Workflows: skyfall.ai/blog/wow-bridging…
- SCOPE, neural planner: skyfall.ai/blog/scope-hierar…
- CASSANDRA: skyfall.ai/blog/pioneering-s…
- AI CEO: skyfall.ai/blog/building-the…

Skyfall AI retweeted
Why LLMs are a dead end for human-level intelligence, and especially for Physical AI / Robotics. The next leap isn’t bigger language models. It’s World Models. I just dropped a full 1-hour presentation from Shanghai: “World Models: the ChatGPT moment for robotics?” → Why LLMs hit a wall → Why action-conditioned world models planning in latent space are the real path → Live World Forge demo with LeWorldModel Hugging Face LeRobot Watch here. The future of intelligence is embodied, not just chatty.
Dr. Fei-Fei Li outlines three functions of a world model in her recent blog: renderer, simulator, planner. It's a useful taxonomy and one that maps closely to our research roadmap at Skyfall. Our work on SCOPE earlier this year addressed the planner: can a small, specialized world model outperform frontier LLMs on sequential decision-making tasks? On TextCraft, it did.
- 55x faster than GPT-3.5 (3s vs 164s)
- 160,000x smaller than GPT-4o (11M vs 1.8T parameters)
- Higher planning accuracy (56%) than frontier models
The planner was step one, but the harder problem is the simulator. That's what we've been cooking. More soon. 👀
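A quick sanity check on the multipliers quoted in the post, as a minimal sketch: it only recomputes the ratios from the figures given there (3s vs 164s, 11M vs 1.8T parameters). The variable names are hypothetical and chosen for illustration, and the 1.8T parameter count for GPT-4o is taken from the post as-is rather than from any official specification.

```python
# Figures as quoted in the post (SCOPE vs. frontier LLMs on TextCraft).
# Variable names are illustrative assumptions, not from the SCOPE codebase.
scope_latency_s = 3        # SCOPE latency, seconds
gpt35_latency_s = 164      # GPT-3.5 latency, seconds
scope_params = 11e6        # 11M parameters
gpt4o_params = 1.8e12      # 1.8T parameters (the post's figure, unverified)

# Ratio of latencies and of parameter counts.
speedup = gpt35_latency_s / scope_latency_s
size_ratio = gpt4o_params / scope_params

print(f"speedup: {speedup:.1f}x")
print(f"size ratio: {size_ratio:,.0f}x")
```

Both ratios round to the headline numbers in the post: roughly 55x for speed and roughly 160,000x for size.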
Skyfall AI retweeted
May 13
This is the single best read on World Models and one of the most important reads in AI. $10B has flowed into "world models" in the last 18 months, from Yann LeCun to Fei-Fei Li. The promise is that, like LLMs, world models will provide the data it takes to scale robotics foundation models, and solve robotics. ...but the word has been abused to mean one of many things.
This post unpacks:
– What 5 traits make a world model?
– How do the different approaches stack up?
– What is it used for within and beyond robotics?
– Where is the opportunity?
– Citations to research, news and blog posts
Companies / products in the space include:
– BigCo products: Google Genie, Tesla Optimus, Nvidia DreamDojo, DreamZero, Microsoft Muse
– Pure world model: AMI Labs, World Labs, Runway, Rhoda, Decart, Spaitial, Odyssey, Embo, Dream Labs, OneWorld
– Robot foundation model cos: Skild, Physical Intelligence, Figure, Mind
Very likely one of the seminal technologies of the next decade.
Skyfall AI retweeted
microsoft MAI tech report is a gold mine, one of the most transparent for a model at this scale. this model uses zero synthetic data or distillation from previous models. this means reasoning, agentic behavior, tool use are all learned fully during post-training with no cold start. bold choice that makes it harder and requires more iterations to reach sota, but you get FULL control over your model series and it proves they are serious about being a frontier lab. the tech report is insanely detailed and precise about numbers. to give an example, they give the exact MFU across all the iterations of the model, with the exact changes etc. they also share the full scaling ladder recipe, to my knowledge this is the first time i've seen this in a tech report at this scale let's look at all of this in this likely very long thread 🧵
Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier. First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks. - It’s a 35B active parameter MoE with a 256K context window. Independent human raters on Surge prefer it for overall quality in blind side-by-sides versus Sonnet 4.6, and it’s achieved 97% on AIME 2025, the key measure of its general-purpose reasoning abilities. - It's at 53% on SWE Bench Pro, placing it right alongside Opus 4.6 on one of the toughest coding benchmarks. - And since we co-designed our models with our own silicon, MAI-Thinking-1 is optimized on our MAIA 200 chip. Benchmarking head-to-head against the GB200, we see 30% better performance per dollar as well as a 1.4x performance-per-watt gain when running our MAI models on the MAIA 200 end-to-end. Next is MAI-Image-2.5 and its Flash variant. Two super strong models now at #2 on the leaderboards, surpassing the score of Nano Banana 2 on image editing. Last for now is MAI-Code-1-Flash, our new inference efficient coding model, especially tuned for VS Code and GitHub Copilot CLI. - Code-1-Flash achieves 51% on SWE Bench Pro, despite having just 5B parameters, putting it closer to Haiku in size but cheaper in cost. All of this is the foundation for Microsoft Frontier Tuning. It lets you customize our models to create custom, company-specific agents that only you control. You can make our model, your model. Your data. Your agents. Your moat. Early adopters are already seeing a difference. When we tuned our models for McKinsey’s tasks, MAI delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost. Also really excited to be collaborating with the amazing team at Mayo Clinic to jointly train a new frontier AI model for healthcare. 
Our announcements today mark another milestone on the road to humanist superintelligence. You can learn more about these and our other new models in our latest blog: microsoft.ai/news/building-a…
Skyfall AI retweeted
There will be no AI jobpocalypse. The story that AI will lead to massive unemployment is stoking unnecessary fear. AI — like any other technology — does affect jobs, but telling overblown stories of large-scale unemployment is irresponsible and damaging. Let’s put a stop to it. I’ve expressed skepticism about the jobpocalypse in previous posts. I’m glad to see that the popular press is now pushing back on this narrative. The image below features some recent headlines. Software engineering is the sector most affected by AI tools, as coding agents race ahead. Yet hiring of software engineers remains strong! So while there are examples of AI taking away jobs, the trends strongly suggest the net job creation is vastly greater than the job destruction — just like earlier waves of technology. Further, despite all the exciting progress in AI, the U.S. unemployment rate remains a healthy 4.3%. Why is the AI jobpocalypse narrative so popular? For one thing, frontier AI labs have a strong incentive to tell stories that make AI technology sound more powerful. At their most extreme, they promote science-fiction scenarios of AI “taking over” and causing human extinction. If a technology can replace many employees, surely that technology must be very valuable! Also, a lot of SaaS software companies charge around $100-$1000 per user/year. But if an AI company can replace an employee who makes $100,000 — or make them 50% more productive — then charging even $10,000 starts to look reasonable. By anchoring not to typical SaaS prices but to salaries of employees, AI companies can charge a lot more. Additionally, businesses have a strong incentive to talk about layoffs as if they were caused by AI. After all, talking about how they’re using AI to be far more productive with fewer staff makes them look smart. This is a better message than admitting they overhired during the pandemic when capital was abundant due to low interest rates and a massive government financial stimulus. 
To be clear, I recognize that AI is causing a lot of people’s work to change. This is hard. This is stressful. (And to some, it can be fun.) I empathize with everyone affected. At the same time, this is very different from predicting a collapse of the job market. Societies are capable of telling themselves stories for years that have little basis in reality and lead to poor society-wide decision making. For example, fears over nuclear plant safety led to under-investment in nuclear power. Fears of the “population bomb” in the 1960s led countries to implement harsh policies to reduce their populations. And worries about dietary fat led governments to promote unhealthy high-sugar diets for decades. Now that mainstream media is openly skeptical about the jobpocalypse, I hope these stories will start to lose their teeth (much like fears of AI-driven human extinction have). Contrary to the predictions of an AI jobpocalypse, I predict the opposite: There will be an AI jobapalooza! AI will lead to a lot more good AI engineering jobs, and I’m also optimistic about the future of the overall job market. What AI engineers do will be different from traditional software engineering, and many of these jobs will be in businesses other than traditional large employers of developers. In non-AI roles, too, the skills needed will change because of AI. That makes this a good time to encourage more people to become proficient in AI, and make sure they’re ready for the different but plentiful jobs of the future! [Original text in The Batch newsletter.]
Skyfall AI retweeted
Jensen Huang just said this is the greatest era in history to build software. AI agents will not kill software. They will do the exact opposite: create a massive new wave of software demand. at NVIDIA GTC Taipei 2026 "Click and type. We now replace that with explaining to the AI what we want, our intent, and the AI generates the code or uses tools to produce the necessary output. This is how computers are going to work in the future. This is Agentic AI. For two years, we've been building toward this, and now it has arrived. One of the big breakthroughs, of course, is tool use. A lot of people have said, “Jensen, AI is coming. Agentic AI is coming. Therefore, all the software companies are going to go out of business.” This is exactly the opposite. Because there are going to be so many agents, the world is no longer limited by the number of people. Therefore, those agents are going to use more tools than ever. This is actually an incredible time to be a software company. But the software has to be presented to the agent in a way that the agent can use it. This is a big breakthrough. And in fact, what we have done, as you know, what Nvidia’s treasure..." ---- From 'NVIDIA' YT channel (link in comment)
Skyfall AI retweeted
Sam Altman reveals that OpenAI’s top “token leader” uses 100B tokens every month, and still falls short of the world’s highest user. Source: axios --- axios.com/2026/06/02/altman-openai-top-token-user
Skyfall AI retweeted
LLMs learn by predicting tokens. World models (JEPA, data2vec) learn by predicting their own abstractions. Which needs more data? For data with hidden hierarchy, we prove the gap is exponential. arxiv.org/pdf/2605.27734
🚀 We're hiring our first Chief of Staff at @skyfallai. You will work directly with our CEO @spisallyouneed. This is a rare opportunity to help shape the trajectory of the company at one of the most exciting moments in AI. You'll sit at the heart of everything, from driving execution and cutting through complexity to making sure we move fast on what matters most. If you're ambitious, sharp, and ready to operate at the highest level, this role was made for you. 📌 Must be based in SF or willing to relocate. 🔗 Link to apply in comments #hiring #chiefofstaff