Some of the stories they aren't telling you:
• A Chevrolet dealership's chatbot agreed to sell a car for $1
• Air Canada had to honor a refund policy that its chatbot made up
• A pipeline ran at 20x its expected cost for 6 days without anyone noticing
Nobody noticed because nothing broke. There were no crashes and no alerts.
That's the issue with agentic applications.
They always generate something that looks coherent, so nothing raises suspicion until it's too late.
There's an amazing free YouTube lecture and blog post from
@arshdilbagi that will help you fix this with a practical framework.
Here is what you'll learn:
• How to set up end-to-end trace instrumentation
• How to build alerts around a silent failure taxonomy
• How to build an eval system from production data
• Complete and concrete implementation steps
Every section of the blog post ends with exactly what to do next.