Offensive Security Researcher @XBOW | A.K.A. none_of_the_above | x2f.me | swordbytes.com

Joined November 2016
I'm not sure the community will like this. @Hacker0x01 will now reuse your novel techniques / exploits / old reports to look for vulns on the rest of the customer's infra. I guess they will add you as collab and give you a bounty, right? right?!
17
40
268
75,652
If I understand this correctly, there are three options for Bug Bounty hunters: (1) Stop reporting novel exploits / techniques to HackerOne; (2) Hoard bugs until they have scanned the entire program's infra, in order not to get "duped" by H1 agents; (3) Shrug and continue as usual
3
1
29
4,271
Leandro Barragan retweeted
Prediction: DNS rebinding is becoming more of a thing now that ports 5000-31337 are legally required to have random vibecoded services listening
you may think your app's service listening on localhost cannot be accessed by random websites. however, have you considered:
2
5
139
20,667
Leandro Barragan retweeted
A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.) Two independent evaluations this week—from XBOW and the UK AISI—confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities. The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints – Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap. XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work. Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high critical severity vulnerabilities, sometimes double what they'd normally find in a year. I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual use capabilities. Clearly, we need them supporting defenders as widely as can be done safely – and especially the least resourced ones. Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities. We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. 
We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes. Also, to be clear, compute has never been a limiter in our rollout. Expect a fuller update on our Glasswing work in the coming days. XBOW report: xbow.com/blog/mythos-offensi… UK AISI report: aisi.gov.uk/blog/how-fast-is…
Replying to @AISecurityInst
Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.
72
221
1,430
674,071
Leandro Barragan retweeted
Security is an economic decision. For a fixed cost, within @XBOW, which model has the best odds of crafting an exploit? GPT-5.5 > Mythos > Opus 4.6 on real OSS web vulns. Curves below.
3
12
63
11,344
Leandro Barragan retweeted
May 12
For the past 2 months, XBOW has been testing Mythos Preview under embargo as part of a select early-access group. Today, we can finally share what we found. The headline: Mythos Preview is a major advance. It is substantially better than prior models at finding vulnerability candidates, especially when source code is available. But it’s not perfect. We surfaced issues with exploit validation, judgment, and efficiency. Our full write-up covers where Mythos Preview shines, where it still needs support, and what we think this means for the future of offensive security: bit.ly/42zQl98
5
58
270
105,353
A few months ago we had access to Mythos. I was lucky to be part of the group of people experimenting with it. My personal take: there is nothing close to it. With the right harness you can throw it at anything with excellent SNR. Official comm: xbow.com/blog/mythos-offensi…
1
9
24
3,298
It was extremely conservative. Even after doing ~20 rounds of prompt eng to optimize the prompts for the new model, it was extremely conservative, dismissing other agents' work as “informational” even when there were actual leads or exploit chain links hidden in the trace
1
1
4
337
But even if it wasn’t useful for this specific use case, it excels at most of the offensive tasks we evaluated. Every model has a specific place in your harness and there is no model that is just “good at everything”.
1
1
4
278
Leandro Barragan retweeted
There is something really addictive about having lots of agents in flight; it’s the same feeling I used to get when I had a big fuzzing job, scraping run, or “compile all the things” type of experiment: the feeling that somewhere silicon is working tirelessly toward your goals
10
8
69
4,176
I’m a simple man, Michał publishes a new book, I buy it
The cat's out of the bag! My latest book, "The Secret Life of Circuits", is available in early access: lcamtuf.coredump.cx/blog/sec… It's what I wish I had when I was starting out. Electrons to embedded systems, 290 color illustrations and 420 pages of well-explained theory.
1
224
Leandro Barragan retweeted
We had early access to Opus 4.7 and ran it against real exploit targets. First look: fewer vulns found per run than 4.6. We almost wrote it off. Then we realized we were counting completions, not tokens. Opus 4.7 takes smaller, more precise actions. Normalize by token budget and the picture flips: it finds more, for less... How you measure matters as much as what you measure. Check out @thewunderalbert's blog post xbow.com/blog/anthropic-opus…
3
7
80
8,849
My personal take on Opus 4.7: we've been experimenting with this model for a few weeks at XBOW and learned a thing or two. Pay special attention to task budgets ;) At first glance, it may feel underwhelming for offsec tasks, but it's definitely the other way around.
Apr 16
Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.
1
3
37
9,487
Tired of getting duped in your submissions to the Linux Kernel and Chromium? Start disassembling stuff and dumping its firmware. Claude doesn’t have access to a soldering station (yet)
1
2
322
Leandro Barragan retweeted
Oooohhhh it looks like the video of my & @vincent_olesen's talk at [un]prompted is up!!
4
5
47
8,775
Leandro Barragan retweeted
I agree with folks who say that this year will be an absolute deluge of CVEs found with AI. But I also worry that it will reveal the limits of the "we'll just fuzz out all the bugs" mindset
5
15
77
13,580
Leandro Barragan retweeted
This week, Disclosed. #BugBounty H1-65 Singapore & H1-468 Stockholm winners, new H1-Elites, Google’s AI VRP, YesWeHack wins EU tender, new programs, tools, write-ups & videos — and more. Full issue → getDisclosed.com Highlights below 👇 @tiktok_us & @okx H1-65 (Singapore) winners: MVH — @corraldev; Community Choice — @Agornello; Best Collab — @kevin_mizu, @infosec_au, @hash_kitten & @HackerOn2Wheels, @ledz1996. @Hacker0x01 H1-468 (Stockholm) winners: @Blaklis_, @snorlhax, @DoomerOutrun (MVH & Best Collab); @holyfield (Eliminator); @Rhynorater (Eradicator/Exterminator); @joaxcar (Community Choice); @alicanact60 (Epic Unreal Hacker). New @Hacker0x01 H1-Elites for 2025: @niemand_sec, @ArchAngelDDay, @mallocsys, @alicanact60, @_godiego_ — congrats! @busf4ctor & @monkehack take AI Bug Research honors at Google VRP Mexico. @yeswehack wins the European Commission’s 4-year bug bounty tender to secure open-source assets. @Hacker0x01 paid $81M in bounties last year — AI vulns spiking. @immunefi rolls out new anti-spam rules (Oct 1) @Bugcrowd opens @SimpliSafe program (up to $6K) @TomKuCoin launches KuCoin program (up to $15K). Google launches a dedicated AI Vulnerability Reward Program (up to $30K) to clarify the scope of AI security findings. Cloud Software Group / @NetScaler goes public with a bug bounty on @Hacker0x01 . CTFs & events: @hackthebox_eu x @Hacker0x01 AI Red Teaming CTF (500 participants) @bugcrowd Mind Cathedral (50 teams, 300 submissions) Videos and write-ups from @NahamSec, @amrelsagaei, @ctbbpodcast , and more. New tools: graphql-cop, HTML-Search-Engine Chrome extension, file_upload_payloads repo, Gemini-API-Key-Exposure-Scanner — handy for recon & CI/CD testing. Notable writeups & research: RCE guides, Next.js testing, supply-chain attack techniques by @0xLupin, SSRF/XSS escalation threads, and a leak exposing personal info of Oscar nominees by @galnagli. Full links, winners, writeups & tools → getDisclosed.com The bug bounty world, curated.
1
7
41
3,283
Leandro Barragan retweeted
It's out!! You can now watch @djurado's and @niemand_sec's talk: "Prompt. Scan. Exploit - Ai's Journey Through Zero-Days, and a Thousand Bugs". Learn more about @Xbow and autonomous hacking. You can watch it exclusively on our YouTube channel: youtu.be/y_aQQmDMaY4. Enjoy!
4
17
49
24,409