We are moving from handcrafted software to auto-evolving software ecosystems.
Focusing on parallels to the MTBF vs MTTR debate and on stable, mature systems may miss the deeper shift and the vast horizon of new opportunity.
For decades, software creation was bottlenecked by human implementation bandwidth. We built systems slowly, carefully, almost like mammal offspring: relatively few systems, deeply maintained, individually understood.
AI changes the reproductive economics of software creation itself.
Now individuals and organizations can spawn fleets of agents, workflows, experiments, products, and companies in parallel at scales humans simply could not reach before. Variation explodes. Evolution accelerates. Selection pressure intensifies. While any one project or system may flake, the macro system selects for survival.
Yes, this can absolutely create fragile catastrophe machines.
Nature does too.
Resilient ecosystems evolve immune systems, redundancy, observability, adaptation, and recovery mechanisms instead of relying on perfect top-down comprehension of every component.
These are often emergent, not architected.
Some layers should absolutely optimize for stability and rigor.
Others may benefit enormously from rapid mutation and exploration.
Our role shifts from handcrafting every component toward shaping environments, incentives, guardrails, and feedback loops for increasingly generative ecosystems.
Even if one organization or person chooses not to operate this way, others will.
That pressure is already here.
And we humans are co-evolving alongside AI systems in this transition.
I probably still don’t want my dentist to use vibe-coded X-ray dosing software at this point, though.
I strongly believe there are entire companies right now under heavy AI psychosis and it's impossible to have rational conversations about it with them. I can't name any specific people because they include personal friends I deeply respect, but I worry about how this plays out.
I lived through the great MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery) reckoning of infrastructure during the transition to cloud and cloud automation. All those arguments are rearing their ugly heads again but now it's... the whole software development industry (maybe the whole world, really).
It's frightening, because the psychosis folks operate under an almost absolute "MTTR is all you need" mentality: "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" We learned in infrastructure that MTTR is great but you can't yeet resilient systems entirely.
The main issue is I don't even know how to bring this up to people I know personally, because bringing this topic up leads to immediate dismissals like "no no, it has full test coverage" or "bug reports are going down" or something, which just don't paint the whole picture.
We already learned this lesson once in infrastructure: you can automate yourself into a very resilient catastrophe machine. Systems can appear healthy by local metrics while globally becoming incomprehensible. Bug reports can go down while latent risk explodes. Test coverage can rise while semantic understanding falls. Changes happen so fast that nobody notices the underlying architecture decaying.
I worry.