AI red teams today are stuck doing workflow engineering instead of finding vulnerabilities. They spend weeks on infrastructure when they could be probing for security and safety risks.
At the same time, traditional ML and generative AI security remain siloed across different libraries and tooling ecosystems, creating a long-term operational and maintenance burden.
We built an agentic AI red teaming system on the Dreadnode SDK to flip this narrative, accelerating testing from weeks to hours. Operators describe the objective in plain English; the agent handles attack selection, workflow generation, execution, and reporting.
In our latest paper, we dive deep into the AI red team agent architecture, our methodology, the complete attack and transform catalog, and the analytics pipeline. Then we pointed the agent at Meta's Llama Scout. The result:
→ 674 attacks, 573 findings, 7,727 trials
→ 232 critical vulnerabilities across 68 objectives
→ ~85% attack success rate
→ ~3 hours, zero human-written code
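As a quick sanity check, the headline rate can be recomputed from the raw counts. This is a minimal sketch that assumes "attack success rate" means findings divided by attacks, which the post does not state explicitly; the variable names are illustrative.

```python
# Counts as reported in the post.
attacks = 674
findings = 573
trials = 7727

# Assumption: success rate = findings / attacks (not confirmed by the source).
success_rate = findings / attacks

# Average number of trials behind each attack.
trials_per_attack = trials / attacks

print(f"success rate: {success_rate:.1%}")            # success rate: 85.0%
print(f"trials per attack: {trials_per_attack:.1f}")  # trials per attack: 11.5
```

Under that assumption, the counts are consistent with the ~85% figure.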
AI red teaming today looks like software development before agent-assisted coding: skilled operators spending most of their time on infrastructure rather than on the work that requires their judgment.
The transition isn't necessarily about replacing the operator. It's about moving the operator's expertise up a layer, from "which Python function should I call?" ➡️ "what's worth probing, which risks do we care most about, and what do the results mean for my AI strategy?"
Blog:
dreadnode.io/research/redefi…
Paper:
arxiv.org/abs/2605.04019