Passionate about the craft of development. Beekeeper and brewer. Helping ISVs @ AWS. Opinions are my own.

Joined May 2007
Why would you wait until you are on the taxiway to check with maintenance to see if a flaky PA system is gonna be a problem?
The further you get from the coasts, the easier it is to understand how wide the aperture of success, contentment, and meaning actually is.
Replying to @matvelloso
Amen. I remember how sad I was to move away from the Bay Area. Thought life had ended bc I was no longer in what I'd believed was the center of the universe. Once outside it was like my brain rewired and I could finally see that most people live happy lives that aren't fixated on tech. It seems like such a silly thing to say but in the Valley's reality distortion field you come to think that there really is no other life beyond RSUs and outsized payouts.
Jeffrey Hammond retweeted
At the Agents Anonymous SF meetup last night we did another 🙋 AI usage survey, here are the est. numbers:
Usage stats:
- 90% Claude Code
- 60% Codex
- 30% Cursor
- 20% OpenCode
- 10% Conductor
- 10% Own agent/Pi
80% have prompted a coding agent from mobile
50% have not handwritten a single line of code this year
99% think they're more productive now vs. pre agentic coding agents
Parallel agent usage:
- 90% 3
- 70% 4
- 50% 5
- 5% 10
Also want to give a ginormous thank you to our incredible speaker lineup:
- @jonas_nelle & @alexirobbins from @cursor_ai
- @southpolesteve from @Cloudflare
- @LewisJEllis from @ycombinator
- @aidandcunniffe from Git AI
- 🦞 @steipete from @openclaw
Hope to see you all at the next one! 🫡
Jeffrey Hammond retweeted
The creator of Kotlin just dropped a new spec-driven programming language. You just write the specs and an AI agent will generate the code.
Seems like we’re getting another bite at the MDD apple, but not in the manner we originally envisioned it…
Jeffrey Hammond retweeted
When a team fully activates on agents and each engineer is shipping 10 PRs a day minimum, the entire traditional SDLC collapses. Code review backlog, keeping up with docs, keeping CI and the merge queue fast, planning, visibility, etc. You end up redesigning the whole SDLC.
Jeffrey Hammond retweeted
Citadel Securities published this graph showing a strange phenomenon. Job postings for software engineers are actually seeing a massive spike. Classic example of the Jevons paradox. When AI makes coding cheaper, companies actually may need a lot more software engineers, not fewer. When software is cheaper to build, companies naturally want to build a lot more of it. Businesses are now putting software into industries and tools where it was simply too expensive before.
Chart from citadelsecurities.com/news-and-insights/2026-global-intelligence-crisis/
Jeffrey Hammond retweeted
Eerie feeling: Talking to people at software companies and getting the impression that they're still acting like it's 2022. Huge teams, roadmaps, product vs. eng vs. design, "haha that'll take a while", AI seen as a "new" thing, no urgency.
Jeffrey Hammond retweeted
"You're better off taking more technical debt in your projects and bet on the fact that LLMs will clean up the debt in the future." singularitea.bearblog.dev/te… < counter-cultural take, and that at least catches my attention. Become a worse software engineer, without stressing it?
Jeffrey Hammond retweeted
🚀 Introducing Strands Agent SOPs: Natural language workflows for reliable AI automation! From code reviews to feature development, meeting notes to incident response. Author new SOPs in minutes, chain for complete automation sequences. go.aws/47R58zI Born from Amazon's builder community, now open-source! #AI #Automation #OpenSource
Jeffrey Hammond retweeted
21 Nov 2025
This killed me.
Jeffrey Hammond retweeted
13 Nov 2025
This story is wild. Chinese state-backed hackers hijacked Claude Code to run one of the first AI-orchestrated cyber-espionage campaigns, using autonomous agents to infiltrate ~30 global companies, banks, manufacturers and government networks 🤯 How the attack was carried out in 5 phases:
13 Nov 2025
We disrupted a highly sophisticated AI-led espionage campaign. The attack targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We assess with high confidence that the threat actor was a Chinese state-sponsored group.
Jeffrey Hammond retweeted
18 Oct 2025
If you saw how people actually use coding agents, you would realize Andrej's point is very true. People who keep them on a tight leash, using short threads, reading and reviewing all the code, can get a lot of value out of coding agents. People who go nuts have a quick high but then quickly realize they're getting negative value. For a coding agent, getting the basics right (e.g., agents being able to reliably and minimally build/test your code, and a great interface for code review and human-agent collab) >>> WhateverBench and "hours of autonomy" for agent harnesses and 10 parallel subagents with spec slop
My pleasure to come on Dwarkesh last week, I thought the questions and conversation were really good. I re-watched the pod just now too. First of all, yes I know, and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my thinking thread, so I think I botched a few explanations due to that, and sometimes I was also nervous that I'm going too much on a tangent or too deep into something relatively spurious. Anyway, a few notes/pointers:
AGI timelines. My comments on AGI timelines look to be the most trending part of the early response. The "decade of agents" is a reference to this earlier tweet x.com/karpathy/status/188254… Basically my AI timelines are about 5-10X pessimistic w.r.t. what you'll find in your neighborhood SF AI house party or on your twitter timeline, but still quite optimistic w.r.t. a rising tide of AI deniers and skeptics. The apparent conflict is not: imo we simultaneously 1) saw a huge amount of progress in recent years with LLMs while 2) there is still a lot of work remaining (grunt work, integration work, sensors and actuators to the physical world, societal work, safety and security work (jailbreaks, poisoning, etc.)) and also research to get done before we have an entity that you'd prefer to hire over a person for an arbitrary job in the world. I think that overall, 10 years should otherwise be a very bullish timeline for AGI, it's only in contrast to present hype that it doesn't feel that way.
Animals vs Ghosts. My earlier writeup on Sutton's podcast x.com/karpathy/status/197343… . I am suspicious that there is a single simple algorithm you can let loose on the world and it learns everything from scratch. If someone builds such a thing, I will be wrong and it will be the most incredible breakthrough in AI.
In my mind, animals are not an example of this at all - they are prepackaged with a ton of intelligence by evolution and the learning they do is quite minimal overall (example: Zebra at birth). Putting our engineering hats on, we're not going to redo evolution. But with LLMs we have stumbled on an alternative approach to "prepackage" a ton of intelligence in a neural network - not by evolution, but by predicting the next token over the internet. This approach leads to a different kind of entity in the intelligence space. Distinct from animals, more like ghosts or spirits. But we can (and should) make them more animal like over time and in some ways that's what a lot of frontier work is about.
On RL. I've critiqued RL a few times already, e.g. x.com/karpathy/status/194443… . First, you're "sucking supervision through a straw", so I think the signal/flop is very bad. RL is also very noisy because a completion might have lots of errors that might get encouraged (if you happen to stumble to the right answer), and conversely brilliant insight tokens that might get discouraged (if you happen to screw up later). Process supervision and LLM judges have issues too. I think we'll see alternative learning paradigms. I am long "agentic interaction" but short "reinforcement learning" x.com/karpathy/status/196080…. I've seen a number of papers pop up recently that are imo barking up the right tree along the lines of what I called "system prompt learning" x.com/karpathy/status/192136… , but I think there is also a gap between ideas on arxiv and actual, at scale implementation at an LLM frontier lab that works in a general way. I am overall quite optimistic that we'll see good progress on this dimension of remaining work quite soon, and e.g. I'd even say ChatGPT memory and so on are primordial deployed examples of new learning paradigms.
Cognitive core.
My earlier post on "cognitive core": x.com/karpathy/status/193862… , the idea of stripping down LLMs, of making it harder for them to memorize, or actively stripping away their memory, to make them better at generalization. Otherwise they lean too hard on what they've memorized. Humans can't memorize so easily, which now looks more like a feature than a bug by contrast. Maybe the inability to memorize is a kind of regularization. Also my post from a while back on how the trend in model size is "backwards" and why "the models have to first get larger before they can get smaller" x.com/karpathy/status/181403…
Time travel to Yann LeCun 1989. This is the post that I did a very hasty/bad job of describing on the pod: x.com/karpathy/status/150339… . Basically - how much could you improve Yann LeCun's results with the knowledge of 33 years of algorithmic progress? How constrained were the results by each of algorithms, data, and compute? Case study thereof.
nanochat. My end-to-end implementation of the ChatGPT training/inference pipeline (the bare essentials) x.com/karpathy/status/197775…
On LLM agents. My critique of the industry is more in overshooting the tooling w.r.t. present capability. I live in what I view as an intermediate world where I want to collaborate with LLMs and where our pros/cons are matched up. The industry lives in a future where fully autonomous entities collaborate in parallel to write all the code and humans are useless. For example, I don't want an Agent that goes off for 20 minutes and comes back with 1,000 lines of code. I certainly don't feel ready to supervise a team of 10 of them. I'd like to go in chunks that I can keep in my head, where an LLM explains the code that it is writing. I'd like it to prove to me that what it did is correct, I want it to pull the API docs and show me that it used things correctly. I want it to make fewer assumptions and ask/collaborate with me when not sure about something.
I want to learn along the way and become better as a programmer, not just get served mountains of code that I'm told works. I just think the tools should be more realistic w.r.t. their capability and how they fit into the industry today, and I fear that if this isn't done well we might end up with mountains of slop accumulating across software, and an increase in vulnerabilities, security breaches, etc. x.com/karpathy/status/191558…
Job automation. How the radiologists are doing great x.com/karpathy/status/197122… and what jobs are more susceptible to automation and why.
Physics. Children should learn physics in early education not because they go on to do physics, but because it is the subject that best boots up a brain. Physicists are the intellectual embryonic stem cell x.com/karpathy/status/192969… I have a longer post that has been half-written in my drafts for ~a year, which I hope to finish soon.
Thanks again Dwarkesh for having me over!
Jeffrey Hammond retweeted
Bob Ross vibe coding was the AI slop I never knew I needed in my life
Jeffrey Hammond retweeted
$24,000 per year from this simple AI Dentist Voice Agent (and why I'm crazy for giving it away for free)
A dental practice was losing $6,000 in revenue every month from missed after-hours calls. That's 20-25 potential patients walking away because no one was available to book their appointments. So I built an AI voice assistant that handles after-hours dental bookings 24/7 using n8n and ElevenLabs based on internal policies and scheduling availability.
Here's what this system does:
→ Answers calls with a natural-sounding AI receptionist
→ Collects patient information and insurance details
→ Checks calendar availability in real-time
→ Books appointments automatically
→ Logs all patient details to a Google Sheet
The result? A similar AI voice system was sold to a dental practice for $24k per year by another entrepreneur!! This isn't just about dental practices. Any service business losing money from missed calls can implement a similar system.
Want the complete n8n workflow template?
1. Retweet & Like this post
2. Comment "ASSISTANT"
I'll send you the entire system for free, a full setup walk-through video, including the ElevenLabs automation components.
Jeffrey Hammond retweeted
Multi-agent AI is a $50B lie. 99% of "multi-agent" systems are just single agents with fancy marketing. I just read the paper that exposes what real multi-agent intelligence actually looks like.
Most people think multi-agent AI is just "multiple ChatGPTs in a room." That's like saying a surgical team is just "multiple people with knives." The real story is way deeper.
Task allocation is completely broken. Current systems are basically throwing darts at a board. Give the math problem to whoever's free. Ask the creative agent to debug code. It's chaos disguised as intelligence. Real multi-agent systems need dynamic specialization. Not just "Agent 1 does X, Agent 2 does Y" but context-aware matching based on capability, workload, and past performance.
The memory problem is insane. Single agents just track conversations. Multi-agent systems need five different memory types: short-term task state, long-term expertise, episodic collaboration history, consensus knowledge, and hierarchical access control. Most current systems give every agent amnesia between tasks.
Context management is where everything breaks. Each agent needs to track three layers simultaneously: the big picture mission, their specific piece, and what everyone else is doing. Fail at any layer and the whole system becomes expensive nonsense.
Game theory matters more than code. When agents debate or negotiate, you're not optimizing for "correctness." You're finding equilibrium states. The research shows Stackelberg dynamics work better than Nash equilibrium for most real tasks. Nobody talks about this because it's not as sexy as "look, the robots are talking."
The applications they outline are wild. Agents that negotiate smart contracts autonomously. Fraud detection where different specialists hunt different attack patterns. Consensus mechanisms that actually think through decisions. We're not building better chatbots. We're building the foundation for autonomous economic systems.
The gap between current "multi-agent" demos and actual multi-agent intelligence is massive. Real systems will have specialized roles, shared memory architectures, and game-theoretic coordination. They'll solve problems no individual agent can handle. Same principle that makes human teams work. Just faster, and at scale. Most of what people call "multi-agent" today is just single agents with fancy prompting. The companies that figure out real multi-agent coordination first will have a 10x advantage. Everyone else is building expensive theater.
Jeffrey Hammond retweeted
16 Sep 2025
Complex orchestration frameworks for AI agents - state machines, workflows, error handling - seem necessary. But what if they're actually holding back what modern LLMs can do?
Jeffrey Hammond retweeted
28 Aug 2025
We recently ran a successful live failover of our control plane database from Azure West US to East US 2. Our goal was to validate that Vercel can survive a complete regional failure of its core database. Here's the full breakdown: vercel.com/blog/preparing-fo…