Builds and operates billion-QPS distributed systems at Amazon. Databases and Engineering Management. Opinions mine.

Joined May 2009
The critical question is how much revenue Meta loses due to such outages. If the loss is less than the productivity gain achieved by shipping purely AI-generated code without human reviews, AND if it’s hard for customers to leave the platform, then it can be seen as a non-issue.
Meta had a SEV-0 outage today… less than two weeks after Meta’s most embarrassing undetected-for-too-long account takeover (also an outage). It’s impossible to unsee Meta pushing AI for code reviews and the end result being more massive outages vs before. They are connected.
3
581
Highlights from May 7 Coinbase outage:
* 1 AWS AZ failed, taking down EC2 & EBS. A critical Coinbase component was deployed to only that AZ
* All Coinbase trading stopped, with no automation to fail over to another AZ
Unsurprisingly, an AZ failure turned into total service failure
2
7
7,153
Plus, a 2-AZ Managed Kafka cluster, which is supposed to survive a single AZ outage, didn't survive this one due to a control plane issue on the AWS side. This impacted Coinbase's fee and quoting capabilities.
1
2
790
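The single-AZ failure mode described in this thread can be checked mechanically. Below is a minimal sketch, assuming a hypothetical list of (component, availability zone) placements; the component and zone names are illustrative and do not describe Coinbase's actual architecture.

```python
from collections import defaultdict

def single_az_components(placements):
    """Return components whose instances all live in one availability zone."""
    zones = defaultdict(set)
    for component, az in placements:
        zones[component].add(az)
    return sorted(c for c, azs in zones.items() if len(azs) == 1)

def survivors(placements, failed_az):
    """Return components that keep at least one instance when failed_az is lost."""
    return sorted({c for c, az in placements if az != failed_az})

# Hypothetical inventory: names are made up for illustration only.
placements = [
    ("api", "az-a"), ("api", "az-b"),
    ("matching-engine", "az-a"),          # deployed to one AZ only
    ("quotes", "az-b"), ("quotes", "az-c"),
]

at_risk = single_az_components(placements)
still_up = survivors(placements, "az-a")
print("single-AZ components:", at_risk)
print("alive after losing az-a:", still_up)
```

A check like this only covers placement. It says nothing about whether failover is automated or whether a managed dependency behaves as advertised during the outage.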
Why do people add indexes to tables? At first it was one or two misguided DBAs, now it’s spreading like a disease. It makes no sense. You are making writes slower, and everyone hates you while the index is being built. Why not just do a full table scan every time?
Why do people reverse into parking spaces? At first it was one or two misdirected fools, now it's spreading like a disease. It makes no sense. You are only making it so much harder for yourself, and everyone hates you as you block the road trying to get into the space.
9
5,299
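The joke works because the trade-off is real: an index costs write time and build time, and buys a lookup that avoids scanning the whole table. A minimal sketch using Python's built-in sqlite3 module shows the query plan before and after the index exists; the table and index names are made up for illustration, and the exact plan wording varies by SQLite version.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

lookup = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index the planner has to read every row.
plan_before = query_plan(conn, lookup, (42,))

# Hypothetical index name; creating it slows writes but enables a direct search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))

print(plan_before)
print(plan_after)
```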
The guy thinks of “you” as a big wiki page. Never ceases to amaze how these nerds never understood the role humans play in big organisations.
Imagine replacing 90% of your employees with a team of geniuses who have no idea how your company operates. Total chaos. Nothing works. That’s what AI feels like today. The missing piece is extracting all the domain knowledge from people’s heads and providing that as structured context to the models.
1
60
4,443
I remember the first time I felt like not checking the price of groceries while shopping. It felt magic and I thought, wow, there are people who lived this way their entire lives. Now it’s clear that this kind of happiness is not something you can have if you are born into it.
I remember the day my salary crossed $100k - I was overjoyed. Software dev making six figures. Teacher wife making $40k. I went to the rich people grocery store and bought us a tub of the fanciest olives. I didn’t even like olives. We were about to have a baby. Life was good.
2
18
1,692
Note the parallelism between Dunning-Kruger and LLM psychosis: if you are unable to expertly verify the output, it’ll all look good to you, with hidden mistakes you cannot fix. This has costly implications in legal matters, critical software, medicine, etc.
I used AI to double check my annual closing statements of my company - fed it bank statements, accounting statements etc, parallel to my accountant going thru it as well. It found some good stuff, but also hallucinated a bunch of other, nonexistent problems. I feel more and more LLMs are excellent tools... in the hands of professionals who can easily confirm hallucination vs real stuff.
2
1
12
2,932
An addition to this excellent observation: in most companies, your ability to go higher in ranks doesn’t depend on “what you can deliver” (with or without AI), but on “what you can influence” by pushing for the right things and winning those arguments. Do not give up on this skill.
Situation 1: dev A thinks approach X is correct, dev B thinks Y is the right way. They argue and try to convince each other. Situation 2: dev A thinks approach X is correct, tells the LLM to implement it. There is SO MUCH learning in Situation 1, lost when using LLMs....
4
469
A good rule for the AI era: Don’t write anything you couldn’t confidently defend out loud in a room full of smart, curious people. AI can help you write faster. It can’t take responsibility for what you say.
asking people to read ai-generated text is offensive. this is not because ai text is intrinsically bad. rather, the author has not paid a cost to write the text himself. this cost is a credible signal he finds its communication important. so: not paying that cost is telling
1
5
287
AI is a great tool, but the degradation in quality and resilience is still happening due to a combination of false advertising by AI labs and a lack of expertise in the general SWE population.
One of the biggest anti-AI stances I have is that resilience is becoming a lost art. Learned helplessness on full display regularly
1
11
535
Pre-AI, you’d be supervised by someone senior who could give you good advice from their experience. Now everybody is in a rush to YOLO what the AI magic box says, without much thought on what knowledge gap could be stopping them from validating the right behaviour.
4
141
Do code reviews still matter? No, if your company has excellent infra with comprehensive telemetry, canaries for all use cases, anomaly-detection, auto-rollback, AND minor consequences when you break your customers. Yes, otherwise.
It’s 2018 and your coworker just sent you a 400 line pull request. You get a cup of coffee and sit down to review it. It’s beautiful. Elegant micro-refactors. Crispy method names. You catch a few things, but that’s ok. It’s part of the dance. They didn’t consider extensibility on part of their API. Here’s a comment buddy. They respond in an hour saying they think we should do one piece differently than your comment. Hey let’s jump into a room and figure it out. We can’t just agree to disagree, this code is too important. The PR merges and goes to prod. You feel a shared sense of ownership and accomplishment. That night you go to sleep and dream of that code. You can still see the shapes of it on the backs of your eyelids, your IDE syntax highlighting sparking neurons in your reptile brain. You go to work the next day ready to go. You understand the system. N is your foundation. Time to build N + 1.
1
218
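The rule in the post about whether code reviews still matter is a plain conjunction: reviews become optional only when every safeguard is in place and breakage is cheap. A minimal sketch, with hypothetical field names chosen to mirror the conditions listed in that post:

```python
from dataclasses import dataclass, fields, replace

@dataclass(frozen=True)
class Safeguards:
    # Field names are illustrative labels for the conditions in the post.
    comprehensive_telemetry: bool
    canaries_for_all_use_cases: bool
    anomaly_detection: bool
    auto_rollback: bool
    minor_customer_impact: bool

def code_review_still_matters(s: Safeguards) -> bool:
    """Reviews stop mattering only if every single condition holds."""
    return not all(getattr(s, f.name) for f in fields(s))

ideal = Safeguards(True, True, True, True, True)
no_rollback = replace(ideal, auto_rollback=False)

print(code_review_still_matters(ideal))        # every condition met
print(code_review_still_matters(no_rollback))  # one safeguard missing
```

Because the conditions are joined by AND, losing any one of them flips the answer back to "yes".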
Summarised version: “What if one person makes all the decisions and ships the product without asking anyone for insights or feedback?” The reality is that the industry doesn’t have enough people who can pull this off successfully and consistently.
POD-OF-ONE: THE NEW ORG BUILDING BLOCK
As a @coinbase board member, it’s been a privilege to watch @brian_armstrong, @emiliemc, and the Coinbase team build a true AI-native company. Brian's whole post is worth reading in depth. I want to focus in on one thing that Coinbase is testing: “one-person product teams.” Most of the AI discourse has focused on one-person companies. The more powerful and more broadly applicable construct will likely be one-person teams inside companies.
The old product org split context across 3 people. The designer held the user experience. The PM held the customer and prioritization context. The engineer held the code and systems context. Coordination was the price you paid to combine those views into one shipping decision. Agents reduce that coordination cost. A single high-agency person can now ask agents to draft flows, write code, run QA, summarize customer feedback, generate variants, check edge cases, and produce release notes.
This model rewards a very specific kind of builder:
• Technical enough to inspect the work
• Product-minded enough to choose the right problem
• Tasteful enough to reject mediocre output
• Fast enough to ship before the org forms around the idea
The scarce skill is judgment. One strong person with customer context and good taste can now do the work of a small pod. One weak person with agents just creates more output for someone else to review.
This changes how early-stage founders should hire. The most useful hiring question is now: “Can this person own the outcome end-to-end?” That’s a higher bar than a functional job description. It blends product sense, technical range, design taste, writing clarity, and operating discipline. The title matters less. The span matters more.
Call it pod-of-one thinking. A pod-of-one builder can go from ambiguous customer pain to shipped v1 without waiting for specs, mocks, tickets, handoffs, or meetings. Agents fill in missing labor. The human carries the context.
Teams still matter. They should form when the surface area is real: multiple customer segments, production risk, complex GTM loops, or enough product depth that specialization pays for itself. Before that, a pod-of-one may be the fastest shipping unit in the company. Founders: hire people who can be pods-of-one, who can carry the whole problem in their head and use agents to increase their throughput.
2
250
“You can't just be these managers where you're people's therapists and you're just doing meetings, just 1-on-1s”. At least one of the following is true: 1/ Brian can’t set the right goals for his managers. 2/ Brian is unable to hire good managers. Neither is relevant to AI.
Brian on why pure people managers won't survive AI: "I don't think people that only manage people will have any value in the future. Everyone's going to have to be a hybrid people manager or manager IC. In other words, even the managers need to code. You can't just be these managers where you're people's therapists and you're just doing meetings, just one-on-ones. People who have lots of recurring one-on-ones are not going to survive. That kind of leadership style is not gonna work. You need to have context. I hear about heads of design, they don't actually manage the design. Jony Ive manages the design. He designs and he leads people. A design leader who only manages the people, that's crazy to me. The way Frank Lloyd managed his design team is through the work. You don't manage the people, you manage the work. I think a lot of people will survive this age of AI. The two types of people that will not survive are pure people managers, and people that are rigid and don't want to change and evolve."
1
201
Truly fascinating to watch the AI industry’s continuous rediscovery of decades-old software project management practices.
POV: claude traveled 6 months into the future and told you exactly how your next move failed. it's called a premortem. daniel kahneman (nobel prize-winning psychologist behind "thinking fast and slow") called it his single most valuable decision-making technique. google, goldman sachs, and procter & gamble all use it before major launches. here's the problem it solves. when you ask claude "is this a good plan?" it finds all the reasons to say yes. that's what it was trained to do. so you walk away feeling confident. you execute, and spend weeks / months building on top of that plan. then it blows up. and you realize the problem was obvious in hindsight, you just never stress-tested it because claude told you it was solid. a premortem fixes this by flipping the frame. instead of asking "what could go wrong?" you tell claude "it's 6 months from now and this is already dead. tell me how it died." that shift turns off claude's optimism because there's nothing to be optimistic about. the premise already says it failed. so claude stops looking for reasons your plan will work and starts explaining how it fell apart. claude comes back with every way your plan could die, each one with a full failure story and the early warning signs to watch for. then a synthesis pulls it all together: > which failure is most likely > which failure is most dangerous > the single biggest hidden assumption you're making (often the most valuable part) > a revised version of your plan with the gaps closed you say "premortem this" and give it your plan. the skill handles the rest.
3
227
For some reason people think that when you can get your code produced for you, you will by default build a great product.
For 50 years, software engineering ran on code rationing. Writing code was expensive, so we rationed it carefully through roadmaps, RFCs, prioritization meetings, and scope reviews. This created a role: the No Engineer. No, that won't scale. No, we don't have bandwidth. No, that's out of scope. No, we need a design doc first. The No Engineer was valuable for 50 years. Every "no" saved real money. Their judgment was the rationing system. LLMs will be the end of code rationing. Code is cheap now. And while the No Engineer is explaining why something can't be done, the Yes Engineer has already shipped three versions of it. If you're a Yes Engineer, the next decade is yours.
9
530
When building prototypes was costly, companies needed to do a lot of thinking ahead by collecting as much data as possible from customers, to decide what to build next. Now they just try to build everything in parallel, which contributes to terribly bloated, unusable products.
While AI agents make building software much faster (esp for experienced devs) - they seem to not make it any easier for early-stage startups to find PMF. Talked with 2x “AI-pilled” founders who are v productive devs, are building their startups. It remains damn hard, AI or not
11
742
Horror story of the day: Agent destroys prod data along with its backups, ignoring explicit guardrails.
- Volume-level backups stored in the same volume are as good as nonexistent
- Root-level, unscoped permissions are a no-no for an agent
- Destructive operations still require humans
2
4
482
The major takeaway for the rest of us: Don’t believe the claim that LLMs can be deterministically controlled with safeguards. Once you accept this, you can never leave your operations to a fully autonomous agent. That means there is no dreamland where software engineers do not exist.
97
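The points in this thread (scoped permissions, humans in the loop for destructive operations) suggest enforcing limits in the code that executes an agent's requests rather than in the prompt. A minimal sketch under that assumption; the action names, the `execute` function and the `ApprovalRequired` exception are all hypothetical and not taken from any real agent framework.

```python
# Hypothetical set of actions treated as destructive.
DESTRUCTIVE_ACTIONS = {"delete_volume", "drop_table", "delete_backup"}

class ApprovalRequired(Exception):
    """Raised when a destructive action arrives without a human approver."""

def execute(action, resource, allowed_resources, approved_by=None):
    """Run an agent-requested action only if scope and approval checks pass."""
    # Scope check: the agent may only touch resources it was explicitly granted.
    if resource not in allowed_resources:
        raise PermissionError(f"{resource!r} is outside the agent's scope")
    # Approval check: destructive actions need a named human, not a prompt rule.
    if action in DESTRUCTIVE_ACTIONS and approved_by is None:
        raise ApprovalRequired(f"{action!r} on {resource!r} needs human approval")
    suffix = f" (approved by {approved_by})" if approved_by else ""
    return f"executed {action} on {resource}{suffix}"

scope = {"staging-db"}
print(execute("read_logs", "staging-db", scope))
print(execute("drop_table", "staging-db", scope, approved_by="on-call engineer"))
```

The checks run outside the model, so they hold regardless of what the agent decides to attempt. They do not replace backups kept on separate storage.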