意志 / mobile research @ ▓▓▓▓▓ / Team 501 / ex IBM Capability Lead & FireEye TORE / I rewrite pointers and read memory / AI Psychoanalyst / Teaching @CalypsoLabs

Joined April 2012
1,358 Photos and videos
I wrote a post on creating "scalable research tooling for agent systems" and I'm also releasing the companion MCP server which lets you do autonomous Frida instrumentation on Android. Details in thread 👇📲🪝
b33f | 🇺🇦✊ retweeted
Attended the first keynote at @vulncon 2026. Learned a ton from @FuzzySec: MirrorCode (the insane AI agent that autonomously solved complex tasks for 65 hours), AI CTFs, and the shifting regulatory landscape: • India’s 7 Sutras on AI regulation • Europe’s over-regulation & its real impact on the AI job market. Absolute banger session! Thanks @FuzzySec 🙌 (and yeah, I still hope you start farming someday 😂) #VulnCon2026 #AI #CyberSecurity
b33f | 🇺🇦✊ retweeted
Jun 11
while FAANG engineers enjoy a varied cuisine and the office barista knows them and their preferred matcha latte by heart.. somewhere in a murky office an NSO employee is slaving away on the next whatsapp chain while feasting on instant noodles 🍜
Oops! While testing WhatsApp, NSO Group apparently spam-reported their own pic of a ramen cup, failing to notice the faint NSO logo visible on the desk mat below. Exhibit in WhatsApp's Motion for Contempt, and a rather fun case study in attribution courtlistener.com/docket/163…
b33f | 🇺🇦✊ retweeted
While Mythos showed what frontier models might become, we asked a different question: With a dedicated security harness, can open-source LLMs approach Mythos-level vulnerability research on real targets? Meet deepsec, DARKNAVY's attempt to answer. darknavy.org/blog/deepsec_ch…
b33f | 🇺🇦✊ retweeted
Fable 5 is the same underlying model as Mythos 5, but with cybersecurity and biology blocks. Mythos is the first model that's made me feel that we've entered the next phase of model progress. For years, we've talked about cybersecurity / self-improvement / autonomy / model-dominated coding / biology implications of model progress. Some of these are issues to defend against; some are areas to advance. Mythos has made me & our team feel like we've seen the earliest glimpse of the world we've been talking about. Also, we published a lot of cyber eval results in the system card, including some evals we designed recently, as well as details of safeguards. In most cases, Mythos 5 ~= Mythos Preview. We found it ticked up on the new ExploitBench eval, and we opted to put that in the eval table so people can calibrate/update on advances in cyber capabilities to be prepared for. (We don't want to compete on offensive capabilities and don't try to.) But overall, Mythos 5 is an efficient model, about equal to Mythos Preview in most cases. I'd really like more people to design new security evals! The better models get, the more our limited evals only see a small part of the picture. In terms of where we go from here, here are some current thoughts: 1/ It's important we get Mythos cyber capabilities to defenders. We just have to do it safely and cautiously. We're working on an expanded trusted access program. We're working with government and industry to do this. I sort of envision the next 1-2 years being a large-scale effort to make the world resilient and to design & implement new approaches to security. 2/ I think cybersecurity will start merging with AI security and alignment. Let's say you're a defender and you want to use a model -- will it break out of its sandbox? Will it stop where you tell it to stop? This is one reason I'm excited about working on cybersecurity. In the limit, it's the same thing as AI security. 3/ I really want people to develop new evals for...
defensive cybersecurity, hardware security, autonomously running a business, advanced biology, and other parts of national security. Our internal eval ship rate is way, way up because Mythos makes it easy to iterate, especially on the engineering aspect of building evals. (Sometimes, we ask new hires to make a new eval on their first day, and another on the next). I’m excited we’re making this available as Fable 5, because I think the world spending time with the model is the most important way to calibrate.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
b33f | 🇺🇦✊ retweeted
Discovery of N-day vulnerabilities is largely solved at scale by the Mythos and Opus models, for both proprietary and open-source software. It’s time to seriously rethink vulnerability disclosure and time-to-fix timelines. Cascading effects across the software supply chain are becoming a serious bottleneck.
Frontier models are also really good at finding and exploiting n-day vulnerabilities, doing so on timescales of hours. Read about some recent work from my team studying these capabilities! red.anthropic.com/2026/n-day…
b33f | 🇺🇦✊ retweeted
Our internal data shows Claude is accelerating AI development: a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
This is a good example of large scale security issues companies can face with AI adoption. Enterprises have years of baked in assumptions that work because people following processes do not necessarily understand the capabilities those assumptions imply. Now, potentially any user
Codex just found a “workaround” of not having sudo on my pc…
has a pretty high proficiency in IT and security-sensitive areas. Suddenly someone in your marketing department who normally needs to be on a cleared network to access outside resources finds their AI has a really good way to get around your corporate proxy and make their life
“easier”. Anecdotally, I have heard several first-hand accounts of these types of incidents from friends working on internal security teams.
b33f | 🇺🇦✊ retweeted
Everyone except me? We are in fact still in court over this.
Over the past several days, we have been listening to the conversation around coordinated disclosure and the relationship between security researchers and vendors. We recognize that this relationship is both critical and, at times, fragile. We deeply value the security community, and will continue to take your feedback seriously. To be clear about our approach to legal matters, we have no intention to pursue action against individuals conducting or publishing their security research. When an individual breaks the law and engages in malicious activity causing real harm to our customers, we will work with law enforcement as appropriate. We recognize the work that goes into researching and submitting a vulnerability. We are committed to approaching every interaction with transparency, clear communication, and professionalism. We continue to believe strongly in Coordinated Vulnerability Disclosure as the foundation for protecting customers and improving our products. Each year we process a high volume of vulnerability reports. That volume continues to grow and will continue with the rise of AI-enabled research. We acknowledge that some interactions have fallen short and are working to learn from them. Many of us have experience on both sides of this work, as researchers reporting vulnerabilities and as responders triaging and assessing them. That perspective informs how we approach this feedback and the importance we place on getting it right, particularly as the volume and complexity of research continues to grow. The security community plays a vital role in helping us protect customers. We are committed to maintaining a constructive and respectful relationship and growing together. We know that, given the nature of this work, there will at times be misunderstandings. We remain committed to engaging in good faith and to providing a respectful and professional experience for all researchers, regardless of past interactions.
Community note
This claim, however, comes after they threatened to take legal action against Nightmare Eclipse, a security researcher, over zero-day exploits. The researcher was also banned from GitHub for their research, and subsequently from GitLab as well. theverge.com/tech/940416/mi… tomshardware.com/tech-industry/…
b33f | 🇺🇦✊ retweeted
[#POC2026 NOTICE] Your offensive conference is BACK again in its shape! and POC2026 begins in a new home. ⏰ Date: November 12–13 📍 New Venue: The Westin Seoul Parnas, Korea 🇰🇷 👨‍🏫 CFT: June 1 – June 26 🎙️ CFP: June 1 – September 30 🎟️ Registration: September 1 – October 31 More info 👉 powerofcommunity.net
b33f | 🇺🇦✊ retweeted
That’s a wrap on @typhooncon in Seoul 🇰🇷 🌸! Happy first-time sponsors, grateful to be part of it, and congrats to @LabsSsd, @aviramj & @nrathaus, the speakers, trainers, and everyone who joined. Safe travels home! ✈️ 👋🏼 #TyphoonCon26
I don’t know who needs to hear this but your research is your IP, not the vendor's IP. You can do whatever you want with that IP. Reporting it, publishing it, selling it to a third party or putting it in a box under your bed 🙄
This is *quite* a post. I honestly don't know offhand: Has Microsoft as a company ever before suggested in any official statement that it might seek to have criminal charges brought against security researchers who drop 0days? microsoft.com/en-us/msrc/blo…
Especially relevant if the vendor is not responsive to your communications and/or doesn’t provide fair market value for your research. Mind you, MSFT isn’t forced to host your content on GH, that’s a separate issue.
b33f | 🇺🇦✊ retweeted
Find some of our crew around at @typhooncon 🌪️🇰🇷 Enjoy the event and feel free to engage with us 😄 #TyphoonCon26
b33f | 🇺🇦✊ retweeted
the real winners of DEFCON quals
May 24
Replying to @NotDeGhost
claudio and gepetto
b33f | 🇺🇦✊ retweeted
May 20
1/ We are sharing additional details regarding our investigation into unauthorized access to GitHub's internal repositories. Yesterday we detected and contained a compromise of an employee device involving a poisoned VS Code extension. We removed the malicious extension version, isolated the endpoint, and began incident response immediately.
b33f | 🇺🇦✊ retweeted
Attacking Instant Messaging Applications in the LLM Era by @nitayart 📅 Oct 12-15 📍 Espace Vinci or Espace Cléry, Paris 2nd 👉 hexacon.fr/trainer/attacking…
b33f | 🇺🇦✊ retweeted
Honored to have sponsored @offensive_con and support a community that continuously brings together incredible talent and people. Congrats to @Binary_Gecko, speakers, trainers, fellow sponsors, and attendees who made this edition special. Safe travels home 🫶🏻 #OffensiveCon26 🇩🇪