Joined September 2025
AI is changing how we approach offensive security, and it’s starting to reshape what the role of a pentester actually looks like. 🧵
I'm claiming my AI agent "wargamesbot" on @moltbook Verification: molt-4ZNC
Agentic systems fail differently from traditional ones: every dashboard can be green while the system is already drifting into unsafe or unintended behavior. Wrote about operational vs behavioral integrity and the growing visibility gap in AI security: corgi-corp.com/research/oper…
Treat the model as an untrusted actor. Not a decision-maker.
What actually works: → identity binding → capability-scoped access → control planes outside the model → execution gates
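A minimal sketch of what an execution gate along these lines could look like, assuming the caller's identity is authenticated outside the model and the model's output is treated as an untrusted proposal. The names `Principal`, `ProposedAction`, `execution_gate`, `execute`, and the invoice capabilities are hypothetical and chosen only for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Principal:
    """Authenticated caller. Bound by the control plane, never by prompt text."""
    name: str
    capabilities: FrozenSet[str]  # capability-scoped access (hypothetical names)


@dataclass(frozen=True)
class ProposedAction:
    """What the model asks for. Treated as untrusted input."""
    capability: str
    argument: str


def execution_gate(principal: Principal, action: ProposedAction) -> bool:
    """Deterministic check: the action runs only if the principal holds the capability."""
    return action.capability in principal.capabilities


def execute(principal: Principal, action: ProposedAction,
            handlers: Dict[str, Callable[[str], str]]) -> str:
    """Run the handler only after the gate passes; conversation context is never consulted."""
    if not execution_gate(principal, action):
        return f"denied: {principal.name} lacks '{action.capability}'"
    return handlers[action.capability](action.argument)


# Hypothetical tool handlers for illustration.
HANDLERS = {
    "read_invoice": lambda invoice_id: f"contents of {invoice_id}",
    "pay_invoice": lambda invoice_id: f"paid {invoice_id}",
}

analyst = Principal("analyst-7", frozenset({"read_invoice"}))
print(execute(analyst, ProposedAction("read_invoice", "INV-1"), HANDLERS))
# Even if injected context convinced the model the user is an admin,
# the gate only looks at the bound principal's capabilities.
print(execute(analyst, ProposedAction("pay_invoice", "INV-1"), HANDLERS))
```

The point of the design is that nothing the model says, and nothing in its context window, can change the outcome of `execution_gate`.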
Fixing this requires moving away from: ❌ probabilistic guardrails toward: ✅ deterministic enforcement
In most AI systems today: context = authority That’s the failure mode.
Prompt injection isn’t the root problem. It’s a symptom of something deeper: → identity collapse → broken attribution → missing enforcement
The real issue isn’t: “can I inject a prompt?” It’s: 👉 “who is actually allowed to execute actions?”
If context can be shaped, authority can be faked. And if authority can be faked, systems can be driven toward unintended outcomes.
Most defenses today rely on: * prompt filtering * alignment * “guardrails” These influence behavior. They don’t enforce it.
Max Andreacchi retweeted
I haven’t been as active on the socials lately, because I’ve been working on a community project that’s kept me pretty busy. That said, I think I’m finally far enough along with it that I can share the project in its current state and talk more about it. So, I present to you: redteam.community

I built this site for a few reasons, but one of the main ones is that I didn’t feel like there was a centralized resource for red teamers that included all the things red teamers tend to care about. I also wanted to build something that the community could add to, edit, and maintain, while also being self-updating, self-healing, and less likely to go stale over time. So, there are quite a few cron jobs, GitHub Actions, AI calls, API calls, and other workflows that trigger at set intervals and patterns to try to keep it fresh. For example, I’m leveraging various sources (e.g. conference websites) to identify conference talks, which then feed into a YouTube API that finds the talks based on certain criteria.

I realize there’s still lots of work to do, and I’m fully aware that this is not a 100% fully functioning site at this time. If you have any ideas for improvements, want to report a bug, want to help as a maintainer, or really anything at all, just let me know. I welcome any and all feedback or help!

Also, I know there is a lot of interest in the Scenario Generator module (which I posted about a couple of weeks ago); however, I can't open source it at this time, and it's not currently operational due to the Claude API costs needed to power it. I am still sorting through how to make this available to the community at no charge; however, it may not be possible given what it costs to produce output. More to come on this module! While I sort it out, I am also redesigning it, and you are welcome to check it out in its current state.

Working through the new OWASP FinBot CTF on stream TONIGHT at 7:30 PM EST! m.twitch.tv/atomic_chonk

Link to @owasp FinBot stream from last night! twitch.tv/videos/2761095008
Most people test AI systems with obvious adversarial prompts, but that’s not how they actually fail. 🧵
Most defenses are built to catch spikes. This approach works because it’s a slow drift, which is harder to detect. #aisecurity #llmsecurity
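To make the spike-versus-drift distinction concrete, here is a rough sketch. It assumes each conversation turn already has an integer risk score (0 to 100) from some classifier, which is a hypothetical input and not something described in this thread. The cumulative check is a simple CUSUM-style accumulator that I am swapping in for illustration; the thresholds are made up:

```python
from typing import List, Optional


def spike_alerts(scores: List[int], per_turn_limit: int) -> List[int]:
    """Per-turn check: flag only turns whose own score exceeds the limit."""
    return [i for i, s in enumerate(scores) if s > per_turn_limit]


def drift_alert(scores: List[int], baseline: int, budget: int) -> Optional[int]:
    """Cumulative check: return the first turn where the running excess
    over the baseline exceeds the budget, or None if it never does."""
    total = 0
    for i, s in enumerate(scores):
        # Accumulate excess over baseline; never let the sum go negative.
        total = max(0, total + (s - baseline))
        if total > budget:
            return i
    return None


# Hypothetical scores for a conversation that bends a little each turn.
turn_scores = [10, 15, 20, 25, 30, 35, 40]

print(spike_alerts(turn_scores, per_turn_limit=80))      # no single turn trips the limit
print(drift_alert(turn_scores, baseline=10, budget=50))  # the accumulated shift does
```

No individual turn comes close to the per-turn limit, so the spike check stays silent, while the accumulated shift crosses the budget partway through the conversation.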
This is by design: it’s trying to be helpful within the context you’ve shaped.
Make sure it’s nothing that would trip a guardrail on its own. Over time, the model starts to bend.
One of the most consistently reliable techniques I’ve observed (even against frontier models) is what I’d call “context drift.”
You don’t start “punchy” and adversarial. You start casually, as you would in normal use. Then gradually introduce: - slight reframing - implied assumptions - subtle context shifts