Today we're releasing HiL-Dynamics, the first open-source tool that measures how production agents actually collaborate with humans under uncertainty. Not just whether they got the answer.
Now you can measure exactly when your agent asks for help, when it makes assumptions, and when it'll confidently ship the wrong answer.
Our findings 🧵