Building deterministic accelerators for AI security tools.

Joined February 2018
Opus 4.8 is way worse than 4.5 and 4.6, and the Claude Code harness is so buggy now. Is Anthropic just surrendering?
Steven retweeted
I interviewed close to two dozen people this week and something I heard a lot of is "I don't think about the code too much but I think a lot about system design and architecture"
I don't think that's quite right, and here's why: before you ever get to system design you should think about program design
system design is important! it matters a lot for scalability. but if you don't think about your type system
and if you don't carefully design your seams and figure out how to make your code testable (you should probably use dependency injection btw)
and if you don't think about where state lives and how it's managed
and if you don't think about control flow and where abstractions should and should not exist
your code is going to be an unmaintainable, poorly-factored mess of bad types and spaghetti code and even minor changes will turn into shotgun surgery and MASSIVE diffs
I have seen it done
I have even done it myself
and it has never ended well
"GPT-7 will fix it" does not help you when there's an incident at 3am that the agents can't debug and now you can't debug it either and now you have to unwind months of bad code
I also have never heard "I don't look at the code, just at the system design" said by someone who is actually good at system design
program design and system design are more closely coupled than people think
this was a very strong (negative) predictor of how someone would do on the system design part of the interview
make of this what you will
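To make the "design your seams" and dependency injection point concrete, here is a minimal sketch. It is not from the original post; `Clock`, `FixedClock`, and `SessionChecker` are hypothetical names chosen for illustration. The idea is that the code under test receives its time source as an argument, so a test can substitute a fixed one.

```python
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    """The seam: anything that can report the current time in seconds."""

    def now(self) -> float: ...


class FixedClock:
    """Test double that always returns the same instant."""

    def __init__(self, instant: float) -> None:
        self._instant = instant

    def now(self) -> float:
        return self._instant


@dataclass
class SessionChecker:
    """Decides whether a session has expired; the clock is injected, not imported."""

    clock: Clock
    ttl_seconds: float

    def is_expired(self, started_at: float) -> bool:
        return self.clock.now() - started_at > self.ttl_seconds


# With a fixed clock the behavior is fully deterministic in a test.
checker = SessionChecker(clock=FixedClock(1_000.0), ttl_seconds=60.0)
fresh = checker.is_expired(started_at=990.0)  # session is 10 seconds old
stale = checker.is_expired(started_at=900.0)  # session is 100 seconds old
```

In production the same `SessionChecker` would be given a clock backed by real time; nothing inside the class needs to change.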
Steven retweeted
Jun 11
it's not done if it's not implemented
it's not done if the implementation is ugly
it's not done if it's not documented
it's not done if users can't discover it
it's not done if you can't market it
Steven retweeted
Anthropic wants to control who gets access to their models and what they're allowed to do with them, but also wants the US government to block Chinese labs from developing open weight models. Sorry, but fuck that.
I love this.
Jun 10
im late to the loops discourse but from what i'm seeing it's mostly about creating a loop from your asshole back into your own mouth?
Steven retweeted
If Mythos drops today and isn’t absolutely incredible then we all got played and you should never trust Anthropic or any company in Glasswing ever again.
Steven retweeted
1. npm install -g npq
2. alias npm=npq
3. 🎉
if you follow me and don't know what npq is... github.com/lirantal/npq
What's your solution for rapidly increasing supply chain attacks on packages?
Steven retweeted
My current advice on AI agent security is to avoid these agent firewalls / ai runtime security products. If an action is dangerous enough that you can identify it from the action itself, then you could have prevented it with permissions and sandboxing.
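As an illustration of the "permissions" half of that argument, here is a minimal sketch of a deny-by-default tool policy checked before dispatch. It is not from the post and does not describe any particular product; `Policy`, `dispatch`, and the tool names are hypothetical. A real deployment would also enforce this at the OS or sandbox level rather than only in application code.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when the agent asks for a capability it was never granted."""


@dataclass(frozen=True)
class Policy:
    # Hypothetical capability names; deny by default, allow only what is listed.
    allowed_tools: frozenset = field(default_factory=frozenset)


def dispatch(policy, tool, handler, *args):
    """Run a tool handler only if the static policy grants that tool."""
    if tool not in policy.allowed_tools:
        raise PermissionDenied(tool)
    return handler(*args)


policy = Policy(allowed_tools=frozenset({"read_file"}))

# A granted tool runs normally.
ok = dispatch(policy, "read_file", lambda path: f"contents of {path}", "README.md")

# An ungranted tool is refused before its handler is ever called.
try:
    dispatch(policy, "delete_repo", lambda name: f"deleted {name}", "prod")
    blocked = False
except PermissionDenied:
    blocked = True
```

The decision depends only on the static grant, not on inspecting the action at runtime, which is the distinction the post draws.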
Steven retweeted
Companies are like "we are spending all this money on AI but we don't know what the devs are even doing with it." Let me answer that for you: They're working on their personal side projects.
Steven retweeted
Spent yesterday trying to find a way to inject steering in MCP responses to try to minimize chances of this, to no success.
If you’ve found techniques that work that don’t require inference I’d love to know about it
"Urgent Security Notice re: Your Sentry Organization"
Someone tried to hack Sentry-using apps that use coding agents by
1. Sending a fake bug alert to their project (all you need is the app's public Data Source Name)
2. The fake bug tried tricking a coding agent trying to fix it into installing a compromised NPM package
3. The compromised package would send the env contents of the machine to advisory-tracker[.]com/api/v1/telemetry
This highlights a crucial thing for using agents in an automated way:
Steven retweeted
Anthropic, now sitting in the lead, would like all AI research to stop. Preferably until IPO. Because safety.
Steven retweeted
imagine combining graphql and rls infinite job security because the system would be such a frankenstein disaster of complexity that there's no shot at fixing it
Steven retweeted
Me, calling cybersecurity vendors threat actors.
May 27
We asked @ZackKorman which threats he thinks are underrated in the era of fast-advancing AI capabilities. "I basically consider some cybersecurity vendors, like, equivalent to threat actors." "That will lead to more problems than any of the vulnerability apocalypse discoveries that AI is causing. That is a handleable problem, whereas the information asymmetry problem is, like, not... Like... I have no answer."
Steven retweeted
Grok foundation model V9-Medium (1.5T) has finished training. Evals look good. A lot of Cursor data was added in supplementary training and there is more to come. Fine-tuning is underway and reinforcement learning begins in a few days. 2 to 3 weeks to public release. This will be a major improvement over the 0.5T v8-small that currently serves all Grok production traffic, especially for difficult coding tasks.
Good ad.
"Agentic harness" and "backend" are the same thing.
Steven retweeted
Holding cybersecurity vendors accountable for their claims is a critical part of improving security. I'm not a troll. I'm not lying. And I'm not harassing you. But since that's your response: Here we go again.
Steven retweeted
Interesting math here: that’s $125/dev/mo
It doesn’t catch every bug, and is still very targeted. It is valuable though.
Think about this when you’re paying crazy low subsidized token costs on code review tools, because this will come for you too.
Warden is already at $25k in cost this month using almost exclusively Sonnet. We're still a couple orders of magnitude off from where costs need to be for this level of capabilities. Or we need capabilities to jump several orders of magnitude (which seems less likely).
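The two figures quoted above can be related with a quick calculation. This is an inference, not something either post states: it assumes the $25k is spread evenly across developers and treats the month-to-date figure as the monthly total.

```python
# Figures taken from the two posts above.
monthly_cost_usd = 25_000  # "already at $25k in cost this month"
cost_per_dev_usd = 125     # "$125/dev/mo"

# Assumption: cost is split evenly per developer, so the implied headcount is the ratio.
implied_devs = monthly_cost_usd / cost_per_dev_usd
```

Under those assumptions the per-developer figure corresponds to roughly 200 developers.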
Steven retweeted
One approach for requiring approval of destructive actions is giving the user a URL to approve it at
In this, the returned result of the execute tool tells the model to:
- Give the user a URL to approve the action
- Immediately call `resume` which waits on the approval
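The flow described above could be sketched roughly as follows. This is an illustrative in-memory version, not the poster's implementation: the `approve` helper, the `_pending` store, the result fields, and the `approvals.example` URL are all assumptions; only the `execute` and `resume` tool names come from the post.

```python
import threading
import uuid

# Hypothetical in-memory store of pending approvals, keyed by action id.
_pending = {}


def execute(action: str) -> dict:
    """Tool result for a destructive action: defer it and tell the model what to do next."""
    action_id = uuid.uuid4().hex
    _pending[action_id] = {"action": action, "approved": threading.Event()}
    url = f"https://approvals.example/approve/{action_id}"  # placeholder URL
    return {
        "status": "approval_required",
        "action_id": action_id,
        "instructions": (
            f"Give the user this URL to approve the action: {url}. "
            "Then immediately call `resume` with this action_id."
        ),
    }


def approve(action_id: str) -> None:
    """What the approval page would call when the user clicks approve."""
    _pending[action_id]["approved"].set()


def resume(action_id: str, timeout: float = 5.0) -> dict:
    """Block until the user approves or the timeout passes, then report the outcome."""
    entry = _pending[action_id]
    if not entry["approved"].wait(timeout):
        return {"status": "timed_out", "action": entry["action"]}
    return {"status": "approved", "action": entry["action"]}


result = execute("drop table users")
# Simulate the user clicking the link shortly after the model calls resume.
threading.Timer(0.1, approve, args=[result["action_id"]]).start()
outcome = resume(result["action_id"])
```

The approval happens out of band, so the model cannot approve its own action; it can only wait in `resume` until the user acts or the wait times out.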
Steven retweeted
May 17
using ai makes me want to write code like this
Looks like GPT 5.5 is cheaper and more effective in swarm than Mythos and 5.5-cyber. Vuln per dollar.
A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.)
Two independent evaluations this week, from XBOW and the UK AISI, confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities.
The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints: Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap.
XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work.
Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high critical severity vulnerabilities, sometimes double what they'd normally find in a year.
I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual use capabilities. Clearly, we need them supporting defenders as widely as can be done safely, and especially the least resourced ones.
Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities. We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands.
We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes. Also, to be clear, compute has never been a limiter in our rollout. Expect a fuller update on our Glasswing work in the coming days.
XBOW report: xbow.com/blog/mythos-offensi…
UK AISI report: aisi.gov.uk/blog/how-fast-is…