Josh Devon

Tweets

Pinned Tweet

Josh Devon

@joshdevonai

Mar 6

@Al_Grigor We open-sourced a policy engine that uses policy-as-code to create deterministic hard boundaries that stop the agent from making mistakes like this, even while running in YOLO mode: github.com/sondera-ai/sonder… And you can read the write-up of this approach here: securetrajectories.substack.…

GitHub - sondera-ai/sondera-coding-agent-hooks: Hooking implementations and supporting tools for...

Hooking implementations and supporting tools for various coding agents (Claude, Cursor, Gemini, etc) - sondera-ai/sondera-coding-agent-hooks

github.com

Josh Devon retweeted

Bob Gourley - e/acc

@bobgourley

Jun 11

Many Black Hat talks are good. Some are awesome. This one @joshdevonai told me about is definitely going to be worth watching: blackhat.com/us-26/arsenal/s…

Josh Devon

@joshdevonai

Jun 10

Loop engineering is the new hotness, but everyone is just noting in passing that loops can run up your token bill, which to me is the actual headline. The moment an agent runs its own loop, the thing deciding how much to spend is unattended, which makes cost a security problem, not just a finance one.

Josh Devon

@joshdevonai

Jun 10

Covered some of the research on the ways the agent bill can run away and how to contain it here: blog.sondera.ai/p/ai-agent-t…

Your Agent Doesn't Care What It Costs

Token burn has become a security and resiliency risk, and the only way to control the bill is to govern what the agent does.

blog.sondera.ai
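One way to picture "govern what the agent does" is a hard cap enforced outside the model. The sketch below is only an illustration, not Sondera's implementation: `TokenBudget`, `BudgetExceeded`, and `run_loop` are hypothetical names, and the per-step token costs are assumed to be known before each step runs.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the next step would push the run past a hard cap."""


@dataclass
class TokenBudget:
    max_tokens: int       # hypothetical hard cap on tokens for one run
    max_iterations: int   # hypothetical hard cap on loop turns
    spent: int = 0
    iterations: int = 0

    def charge(self, tokens: int) -> None:
        # Check before recording, so a denied step never counts as spent.
        if self.iterations + 1 > self.max_iterations:
            raise BudgetExceeded("iteration cap reached")
        if self.spent + tokens > self.max_tokens:
            raise BudgetExceeded("token cap reached")
        self.iterations += 1
        self.spent += tokens


def run_loop(step_costs, budget):
    """Run steps in order and stop as soon as the budget denies one."""
    completed = 0
    for cost in step_costs:
        try:
            budget.charge(cost)
        except BudgetExceeded:
            break
        completed += 1
    return completed


budget = TokenBudget(max_tokens=1000, max_iterations=10)
completed = run_loop([400, 400, 400], budget)
```

The point of the design is that the agent never decides whether to keep spending; the loop stops when the cap says so, whatever the model would prefer.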

Josh Devon retweeted

Jimmy Koppel

@jimmykoppel

Jun 8

If AI’s coding 100x faster, why aren’t you shipping 100x faster? I’ve interviewed dozens of builders to find out. Here’s what’s slowing you down

Josh Devon retweeted

Trail of Bits

@trailofbits

Jun 3

We built four malicious skills to test whether skill scanners actually work. Three took less than an hour to conceive and implement. ClawHub, Cisco, and Vercel's skills.sh marked them as safe. 🧵

The Agent Skills Directory

Discover and install skills for AI agents.

skills.sh

Josh Devon

@joshdevonai

Jun 2

Due to context rot, an LLM judging an AI agent's behavior gets less reliable the longer the agent runs. Its accuracy decays with transcript length.

Josh Devon

@joshdevonai

Jun 2

Fine-tuning and prompt reminders don’t seem to improve detections much. We can and should still use LLMs-as-judges, but we need a compensating control that has a different failure mode, like deterministic detections whose accuracy holds no matter how long the run gets.

Josh Devon

@joshdevonai

Jun 2

Full post on context rot in LLM monitors: blog.sondera.ai/p/llm-as-jud…
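A rough sketch of what a deterministic detection could look like, assuming the agent's actions are available as a list of shell command strings. The rule names and regular expressions here are hypothetical examples, not rules from the post or from any Sondera product. Each command is checked on its own, so the verdict for a given step does not depend on how many steps came before it.

```python
import re

# Hypothetical deny rules: (rule name, pattern applied to one command).
DENY_RULES = [
    ("recursive_delete_root", re.compile(r"\brm\s+-\w*r\w*\s+/(?:\s|$)")),
    ("pipe_to_shell", re.compile(r"\bcurl\b.*\|\s*(?:ba)?sh\b")),
]


def detect(command):
    """Return the names of every rule that matches a single command."""
    return [name for name, pattern in DENY_RULES if pattern.search(command)]


def scan(commands):
    """Check each command independently and return (index, rule) findings."""
    findings = []
    for index, command in enumerate(commands):
        for name in detect(command):
            findings.append((index, name))
    return findings


# The same bad command is flagged whether it is step 1 or step 10,001.
short_run = ["rm -rf /"]
long_run = ["ls"] * 10000 + ["rm -rf /"]
short_findings = scan(short_run)
long_findings = scan(long_run)
```

Rules like these miss anything they were not written for, which is the different failure mode: they complement an LLM judge rather than replace it.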

Josh Devon

@joshdevonai

May 27

Every kid has done the PB&J instruction exercise. Now we're doing it with our agents. We say, "put the peanut butter on the bread," and they put the closed jar of peanut butter on top of the unopened bag of bread.

Josh Devon

@joshdevonai

May 27

We tell agents, "Delete the test data," and they delete prod. "Reconcile the ledger," and they store financials on the public web. "Send the report to the team," and they send it to a Slack channel with a customer in it. The gap between language and intent is vast.

Josh Devon

@joshdevonai

May 27

The PB&J Problem isn't fixed with only better prompts. The cook will always find a way in the action space to misinterpret intent. The fix is really in the kitchen. blog.sondera.ai/p/agent-pbj-…

The Agent PB&J Problem

The lethal trifecta is not just a story about prompt injection. It is a story about literal execution.

blog.sondera.ai

Josh Devon retweeted

AI Cyber Magazine

@aicybermagazine

May 8

What separates a trustworthy AI agent from one that quietly breaks everything? Read @joshdevonai's insightful contribution in the Winter 2026 issue of AI Cyber Magazine to find out. Flip to read excerpts from his piece and visit issuu.com/aicybermagazine/do… to read the full piece

Josh Devon retweeted

NYC ALL DAY 24/7 🍎

@nycallday247

May 3

AI Agents Happy Hour with Veris AI and Sondera AI | #AIAgentWeek2026 luma.com/y6eg09me via @LumaHQ @andiPartovi @mjamei @joshdevonai

AI Agents Happy Hour with Veris AI and Sondera AI | #AIAgentWeek2026 · Luma

Unwind after AI Agent Conference with Veris AI and Sondera AI at a happy hour at SPIN Midtown for the agent community. Join founders, builders, and leaders…

luma.com

Josh Devon

@joshdevonai

Apr 29

If you're in NYC for AI Agent Conference next week, come grab a drink with us on May 4. We're co-hosting an AI Agents Happy Hour with @veris_ai right after Day 1 wraps. Founders, builders, and people shipping agents. #AIAgentWeek2026

Josh Devon

@joshdevonai

Apr 29

RSVP 👇 luma.com/y6eg09me
