Tweets

lukex retweeted

Andrej Karpathy

@karpathy

Jun 9

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

Claude

@claudeai

Jun 9

Replying to @claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.

Benchmark table titled Mythos 5 & Fable 5, comparing Claude Mythos 5 and Fable 5 against Claude Mythos Preview, Claude Opus 4.8, GPT 5.5, and Gemini 3.1 Pro.

lukex retweeted

Brian Armstrong

@brian_armstrong

Jun 8

Good take

My guess is
- demand for intelligence is near infinite
- but 80% of workloads will be running on 99% cheaper models within 12-18 months
- 20% of workloads will still run on latest gen models where IQ maxing is important (scientific breakthroughs, higher level orchestrator agents?)
- rough analogy might be what % of macbooks or gaming PCs sold have the maxed out specs for CPU/GPU, prices are falling much faster than Moore's law here though
- this leads me to think the limiting factor will be energy and compute, not better models

At Coinbase we're working hard on routing prompts to cheaper models where appropriate, and in some cases have been able to keep costs roughly flat, while token usage continues to grow exponentially.
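A rough sketch of what "routing prompts to cheaper models" could look like, purely for illustration. Nothing here reflects Coinbase's actual system: the tier names, the prices, and the keyword heuristic are all made-up assumptions. The cheap tier is priced at 1% of the frontier tier to mirror the "99% cheaper" figure in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier label, not a real model ID
    cost_per_1k_tokens: float  # made-up illustrative price


# Assumed prices: the cheap tier costs 1% of the frontier tier ("99% cheaper").
CHEAP = ModelTier("cheap-tier", 0.01)
FRONTIER = ModelTier("frontier-tier", 1.00)

# Hypothetical keywords standing in for a real difficulty classifier.
HARD_TASK_HINTS = ("research", "orchestrate", "prove")


def route(prompt: str) -> ModelTier:
    """Send a prompt to the frontier tier only if it looks hard."""
    text = prompt.lower()
    if any(hint in text for hint in HARD_TASK_HINTS):
        return FRONTIER
    return CHEAP


def blended_cost(prompts: list[str], tokens_per_prompt: int = 1000) -> float:
    """Total cost of a batch when every prompt goes through the router."""
    return sum(
        route(p).cost_per_1k_tokens * tokens_per_prompt / 1000 for p in prompts
    )
```

In practice the keyword check would likely be replaced by a small classifier or a confidence-based fallback, but the shape is the same: most traffic goes to the cheap tier and only the hard minority pays frontier prices.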

Tommy

@Shaughnessy119

Jun 2

The most basic way AI could blow up imo. I'm not saying it does but this is the most obvious way I can see it happening

- Per seat subscriptions are massively subsidized. The flat fee was priced way below what heavy usage actually costs
- For real business use you have to move to the API anyway. Data protections, work integrations and compliance officer approval
- On the API you pay metered rates, and businesses are burning credits way faster than the per seat pricing ever led them to expect
- This is everywhere right now. Internally for us, Codex users, Uber torching its entire 2026 AI budget in 4 months, the Microsoft comments. Just go try an API. I shared more on this here: x.com/Shaughnessy119/status/…
- And I don't think most businesses have the money to keep paying increasing API rates without a real change to how they operate (caps needed)
- Because they have a cheap alternative. They can reach open source models through any aggregator (OpenRouter, Venice, Baseten, Together) and still get strong privacy. Venice private data centers, or E2EE/TEE serving GLM 5.1. More on open source inference provider raises here: x.com/Shaughnessy119/status/…
- And the discount is enormous. DeepSeek V4 codes within a hair of Opus on SWE bench at roughly 1/30th the price, and the cheapest open models run closer to 1/100th
- Chinese labs open source frontier grade models. The model is the single biggest cost an inference provider has, and they get it for free
- This idea dies if China goes closed source. That is actually bullish web2 AI labs, because if everyone is closed you pay up for the best intelligence. China goes closed source if they are tired of giving away an asset and they want the revenue and data flow to train new models
- Is this showing up in web2 AI lab revenue yet? No. Revenue is off the charts. Anthropic went from 9B to 47B run rate in five months
- So go forward, what happens?
- I think revenue slowly starts leaking to the open source inference providers (see Venice usage, OpenRouter's $113M raise, Baseten is raising at $11B or triple its valuation in three months, on revenue that went from $200M to $600M annualized in a single quarter)
- It doesn't move overnight, but it caps the labs' ability to raise prices, and margins are already deeply negative. OpenAI is reportedly running near negative 122%
- With margins that bad there is no cash flow, so the labs are fully dependent on outside capital to buy GPUs, train models, and keep subsidizing usage (i.e. see Google tapping $80b equity sale, granted 30b for employee RSU taxes. Clearly they think equity is overvalued or you wouldn't sell it)
- The break comes when that capital stops. Pricing is capped so margins can't improve, and the moment investors lose conviction on payback, the whole flow reverses
- Why would they lose conviction on payback? Back to the start - the inability to improve margins or get businesses to pay more
- This is also limiting, if we start making new drugs with AI or create entirely new businesses, you better believe people will pay up to the max for AI usage

lukex retweeted

Spencer Yang

@spenceryang

May 20

SpaceX may soon become one of the first companies to IPO at a $2T valuation, bringing together SpaceX, xAI, and X. I started my post-college career at Twitter. I watched the platform evolve, grow, struggle, reinvent itself, and even later worked out of its former San Francisco office after it became a co-working space run by BLK71 SF. To mark the moment, I shipped TrillionMarketCap: a live registry of the assets, companies, commodities, and networks large enough to be measured in trillions. Gold. NVIDIA. Apple. Bitcoin. SpaceX.

lukex retweeted

Spencer Yang

@spenceryang

May 19

x.com/i/article/205679616826…

lukex retweeted

Allie K. Miller

@alliekmiller

May 13

The most expensive mistake in enterprise AI right now: treating FDEs as your whole transformation plan.

Forward deployed engineers (FDEs) are important for custom deployments, but they won’t fix the change management issue most enterprises are facing. It’s likely more the former that Anthropic and OpenAI will continue to prioritize (and hire into the thousands, who knows). Beyond performance and cost, it’s systems integration, ROI, and literal usefulness that drive revenue and stickiness.

*However*

External FDEs, in my opinion, will not make your company an AI-first company. You can have the sleekest multi-agent orchestrations and still have the majority of your employee base hating AI, avoiding AI, and distrusting leadership decisions on AI.

And we already know this because we see this in traditional SaaS too: you can customize the heck out of your Salesforce deployment, but that doesn’t mean your sales team will improve their data hygiene or even attempt to change the way they track and grow with it. Buying a fancier car doesn’t mean you magically learn to drive better overnight.

If you’re an enterprise exec and FDEs are sold as the immediate and sole solution to your company transformation woes, walk away. It’s the combination of tech *and* people enablement *and* process reinvention that compounds into actual business outcomes. Large complex enterprises will stall out if they only prioritize the first.

Aaron Levie

@levie

May 13

Forward deployed engineers, or equivalent, are about to become one of the most in-demand jobs in tech. And one of the most important functions for AI rollouts. Deploying agents is far more technical of a task than most people realize, often far more involved than deploying software. Software generally works the same way every time, and generally for the past few decades has been updated versions of an existing technology or concept (which basically means easier for the enterprise to update their workflows on a newer system). With agents, you’re actually deploying the equivalent of work output within the enterprise. The customer is effectively using you as a professional services provider for a task, which they expect to get solved nearly end-to-end now. This means you need to actually deeply understand the business process as a vendor, and get the customer from the current to the end state seamlessly. Companies need help figuring out which models will work best for their workflows, they need extensive evals setup often, they need change management support for workflows, they need to get their data setup for the agents, and constant tuning of the agentic system for their process. Massive role in tech now. And another example of the kind of highly technical work that AI is creating.

lukex retweeted

tae kim

@firstadopter

Apr 17

Oh look! Anthropic's entire "we are delaying Mythos" narrative was marketing hogwash. Kudos to FT for confirming what was obvious. Anthropic simply doesn't have the compute. FT: "Multiple people with knowledge of the matter suggested Anthropic was holding back from a wider release until it could reliably serve the model to customers."

tae kim

@firstadopter

Apr 16

Everyone should read what's below. This is why actually knowing your stuff instead of naively regurgitating a particular startup's marketing propaganda bullet points is important. I've also included a screenshot of my Substack writeup of Nvidia's Bill Dally and Google's Jeff Dean GTC session that confirms Gavin's analysis.

lukex retweeted

CMEM

@Claude_Memory

Apr 16

SIXTY THOUSAND (60,000) STARS ⭐️ TODAY ranked amongst legends, #301 and climbing

lukex retweeted

Charly Mwangi

@charlythuo

Apr 10

Just spent a week in China deep diving the general-purpose robotics ecosystem.

Key takeaway: while we’re vibe-coding… China is vibe-manufacturing!

A few things that stood out:

1) China has cracked “vibe manufacturing”
Startups are spinning up hardware like we spin up code. AGIBot (3 years old) has already built ~10,000 robots.

2) The entire stack is being built in parallel.
Every serious robotics company is full-stack: hardware, controls, foundation models.

3) Data factories are real and massive.
Hundreds to thousands of people teleoperating robots 24/7 to generate training data. In some cases, the government is literally buying robots, generating data, and selling it back to companies.

4) The supply chain is overwhelming.
Foxconn, BYD, LYitech - everyone is plugged into the same dense, hyper-responsive manufacturing base. This is why iteration speed is so high.

5) Structural paradox: Labor is both tailwind and headwind.
Cheap, abundant skilled labor powers the supply chain… But it also makes automation harder to justify domestically.
→ Weak ROI for robotics inside China
→ Strong incentive to export

6) Hardware is impressive. Intelligence is not (yet).
Amazing kinematics: dancing, acrobatics. But limited ability to execute simple instructions reliably.

7) Everyone is moving up the stack
Every major CM/ODM is building their own robots: humanoids, wheeled. Today’s suppliers will be tomorrow’s competitors.

8) Dexterity remains unsolved
Lots of prototypes. Very few real demos.

So what does this mean?

Physical AI requires strength in both bits and atoms. Right now:
China → dominates atoms (manufacturing, supply chain, scale)
US → leads in bits (models, autonomy, software)

We are dangerously behind in atoms. If we want to compete, incrementalism won’t cut it.

We need to:
- Build depth and breadth across the electro-mechanical supply chain
- Scale CMs / ODMs / JDMs domestically
- Move 100x faster, think 100x bigger on scaling manufacturing infrastructure

Hats off to those doing their part to advance domestic manufacturing supply chain - @makematterco, @VulcanForms, @brightmachines, @thebotcompany, @gs_ai_, @MytraUS, @mind_robotics, @tesla_optimus, @atomic_inc, @Senra_Systems, @pathrobotics, @machinalabs_, @figure_robot, @HadrianInc, @agilityrobotics

lukex retweeted

Peter Steinberger 🦞

@steipete

Apr 8

I'm working on character evals and noticed that Claude would constantly pick itself as #1, so I removed the model names from the judge and changed things.
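A minimal sketch of the blinding step described above, under assumptions: this is not the author's actual eval harness, and the function names, the "Response A/B" labels, and the prompt wording are all hypothetical. The idea is simply that the judge sees neutral labels in shuffled order, and the mapping back to model names stays outside the prompt.

```python
import random


def anonymize(responses: dict[str, str], seed: int = 0):
    """Swap model names for neutral labels and shuffle the order.

    Returns (blinded, key): `blinded` maps label -> response text and is
    what the judge sees; `key` maps label -> model name and is kept private.
    Supports up to 26 responses with the A-Z labelling used here.
    """
    names = list(responses)
    random.Random(seed).shuffle(names)  # avoid a fixed position per model
    labels = [f"Response {chr(ord('A') + i)}" for i in range(len(names))]
    blinded = {label: responses[name] for label, name in zip(labels, names)}
    key = dict(zip(labels, names))
    return blinded, key


def build_judge_prompt(blinded: dict[str, str]) -> str:
    """Assemble a judge prompt that never mentions a model name."""
    parts = ["Rank the following responses from best to worst."]
    for label, text in blinded.items():
        parts.append(f"{label}:\n{text}")
    return "\n\n".join(parts)
```

After the judge returns a ranking over the labels, `key` maps each label back to its model. Note that this only hides names in the framing; a response that names its own model in its text would still leak identity and would need separate scrubbing.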

lukex retweeted

Gergely Orosz

@GergelyOrosz

Mar 31

This is either brilliant or scary: Anthropic accidentally leaked the TS source code of Claude Code (which is closed source). Repos sharing the source are taken down with DMCA. BUT this repo rewrote the code using Python, and so it violates no copyright & cannot be taken down!

lukex

@Lukex

Mar 31

the most entertaining timeline...Claude Code source code has been leaked. here're some upcoming features! h/t @Fried_rice

Josh Kale

@JoshKale

Mar 31

Anthropic just accidentally leaked Claude Code’s entire source… seriously 😳

Buried in the code are 4 secret features they haven’t announced yet. Here’s what’s coming:

BUDDY
- A Tamagotchi-style AI pet that lives next to your input box
- 18 species. Rarity tiers. Shiny variants. Permanent personality.
- Teaser drops April 1. Full launch May 2026.

KAIROS
- “Always-On Claude.” A persistent agent that runs across sessions.
- Watches, logs, and proactively acts without you typing anything.
- Has a nightly “dreaming” cycle that consolidates its memory.

ULTRAPLAN
- 30-minute deep planning sessions in the cloud.
- Claude explores and builds a plan. You approve or reject in browser.
- Can “teleport” the session to your local terminal when ready.

COORDINATOR MODE
- One Claude spawns multiple worker Claudes in parallel.
- Workers report back with status, token usage, duration.
- Multi-agent orchestration built directly into the CLI.

This is the compiled code behind feature flags. They’re actively building all of this in secret.

lukex retweeted

Alfred Lin

@Alfred_Lin

Mar 27

A CEO from one of our portfolio companies shared this with their team. I’m re-sharing it with their permission, because it resonated and reflects what all founders and CEOs should be communicating.

--

We are living through a period of compounding change. And in moments like this, the biggest risk is no longer making the wrong decision. It is moving too slowly while the world moves around you.

There are two paths.

We can play defense:
- Protect what we have
- Optimize what works
- Wait for clarity

It feels safe. It isn’t.

Or we can play offense:
- Learn faster than the environment changes
- Use new tools to solve old problems in better ways
- And create entirely new strategies and businesses

That’s where the opportunity is.

Challenge yourself to do things faster and better than you have ever attempted. Stay uncomfortable. Stay on the front foot.

lukex

@Lukex

Mar 18

1000% agree. you don't win your vertical deploying AI to 10x side quests, while your competitors are deploying AI to 10x main quests

Gergely Orosz

@GergelyOrosz

Mar 17

Sage observation from @karrisaarinen (CEO of Linear).

It now makes SO MUCH sense why I see a bunch of eng teams rebuild a SaaS vendor in-house with AI, brag about it, and feel good.

They are doing side quests... and they don't even know it. And they are not helping their co win!!

lukex retweeted

Xiaoyin Qu

@quxiaoyin

Mar 10

The scariest thing about AI in 2026 isn't some sci-fi scenario. It's watching people you know — people with the same credentials, the same caliber — split into two completely different groups in a matter of months. I've seen it happen firsthand. Stanford grads, ex-Meta engineers, startup founders. Three months ago, they were all roughly at the same level. Now? The divergence is so obvious it's uncomfortable. Some of them got really good at AI. Not just "using ChatGPT" good — fundamentally different in how they think, work, and produce. Their output is compounding. Their depth of insight is compounding. They look like they're playing a different game entirely. Others are still running on the resume they built five years ago. And here's the number that haunts me: 99% of people still use AI at the level of "What's the weather today?" or "What kind of flower is this?" The 1% who figured it out aren't even one group. There's massive variance within them — some are orchestrating AI agents to run entire companies, some use it for research that would take a whole team, some have AI write half their code, some have AI write all of it. The income implications are brutal. If someone uses AI to produce the output of 10,000 people, they're worth 10,000x the salary. Someone who can't figure out a single tool? They might not be worth hiring at all. What really unsettles me is how fast our patience is eroding. The moment we feel someone performs below what AI can do, we don't think "they need training." We think "they're worth zero." Not less. Zero. So the real AI danger isn't AI going rogue. It's the epic, unprecedented amplification of the gap between people — in capability, in income, in relevance. One silver lining: the old hierarchy is broken. People who were once untouchable can now be overtaken by someone who masters AI faster. That door is genuinely open. But if you don't walk through it, you won't just fall behind by a little. You'll become invisible. 
#AISkillGap #FutureOfWork #ArtificialIntelligence #Productivity

lukex

@Lukex

Mar 10

pretty wild

Poe Zhao

@poezhao0605

Mar 10

MiniMax surpassed Baidu in market cap today. HK$383 billion vs HK$332 billion. Stock is up 51% in two days. Here is the thing. MiniMax made $79 million in 2025 revenue. Baidu made $18.9 billion. That is a 239x gap. Yet the market now values MiniMax higher. I published deep dives on both companies' earnings this month. MiniMax's report was the AI industry's first open book from a pure-play model company. Baidu's report revealed why $5.8 billion in AI revenue still wasn't enough. Two earnings. Two very different answers to the same question: what does AI survival look like? MiniMax analysis: hellochinatech.com/p/minimax… Baidu analysis: hellochinatech.com/p/baidu-a…

lukex

@Lukex

Mar 10

Most friends in the US/West are just hearing about public OpenClaw installs in Shenzhen, and not this yet.

Government moving at the pace of startups is a sight to behold

Poe Zhao

@poezhao0605

Mar 9

OpenClaw mania in China has crossed from tech hype to government policy. Shenzhen moved first. Now Wuxi's high-tech zone dropped a 12-point draft policy specifically supporting OpenClaw-based development. Compute subsidies up to $42K/year. Full cloud platform subsidies up to $140K. And up to $700K for breakthroughs in embodied AI robots and smart industrial inspection. This is how China scales technology. Government sets the table with money and policy. Companies bring the products.

lukex retweeted

Aakash Gupta

@aakashgupta

Feb 27

Anthropic is running a masterclass in negotiation-as-marketing right now. The $200M Pentagon contract represents 1.4% of Anthropic’s $14 billion run rate, up 14x from $1 billion fourteen months ago. This is not a number worth compromising a brand over. Amodei knows this. The Pentagon knows this. So why is he personally publishing a detailed statement, point by point, timed for maximum news cycle impact? Because every headline that reads “AI company refuses Pentagon’s demands on autonomous weapons and mass surveillance” is worth more than the contract. Anthropic just bought the most expensive brand positioning in AI history, and the Pentagon is paying for it. The statement is surgically written. Amodei opens by affirming he believes in using AI to defend democracies. Lists every classified deployment Anthropic pioneered. Emphasizes they’ve never objected to specific military operations. Then draws two narrow lines: no mass surveillance of Americans, no fully autonomous weapons. The framing makes it almost impossible to argue against without sounding like you’re pro-surveillance. The Pentagon’s negotiator called Amodei a “liar” with a “God complex.” The Pentagon threatened to invoke the Defense Production Act and label Anthropic a supply chain risk simultaneously. Amodei pointed out those two threats are contradictory: one says Anthropic is dangerous, the other says Claude is essential. That line will be in every news story for the next 48 hours. It was designed to be. Sen. Tillis, a Republican not seeking reelection, broke with the administration on the record. Said the Pentagon was being “unprofessional” and that you should listen when a company turns down money out of concern for consequences. Anthropic didn’t have to lobby for that. The positioning did the work. Every enterprise buyer evaluating AI vendors just watched Anthropic publicly refuse to let a customer override their safety commitments. 
For a company selling to regulated industries, that demo is priceless. The 5:01pm Friday deadline is tomorrow. Anthropic will either keep the contract with safeguards intact or lose it and gain something more valuable: permanent differentiation in a market where every other lab said yes.

Anthropic

@AnthropicAI

Feb 26

A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War. anthropic.com/news/statement…

lukex

@Lukex

Feb 27

good for humanity

Anthropic

@AnthropicAI

Feb 26

A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War. anthropic.com/news/statement…
