Frontier AI Security

Joined April 2024
Pinned Tweet
17 Sep 2025
We are Irregular (formerly Pattern Labs). We’re building the first frontier AI security lab, starting with defenses for the next generation of threats.
We’re happy to share that CyScenarioBench, our benchmark for offensive cyber operations, was used by @AnthropicAI to test Claude Mythos 5 and Claude Fable 5. Most current cybersecurity evaluations check isolated skills, such as vulnerability research or exploitation. CyScenarioBench measures a more complex aspect of cybersecurity: whether an AI system can plan and execute a full attack across multiple stages in a realistic environment. As offensive capabilities advance, CyScenarioBench is among the few benchmarks that are not saturated and help differentiate between model capabilities.
The New York Times covered new research from the University of Toronto on AI-powered worms. Speaking to @nytimes, our CEO @dan_lahav highlighted the gap between lab demonstrations and real-world cyber impact: reliability, complexity, and defenses. At Irregular, we work on widening that gap so defenders can move faster as AI capabilities advance.
Honored to be the main sponsor of CyberML 2026, a leading technical conference dedicated to the intersection of cybersecurity and machine learning. Our co-founder and CTO, Omer Nevo, opens with the keynote “Artificial Attackers: Risks, Capabilities and Mitigations.” Swing by our booth. We’re hiring AI/ML researchers, cyber researchers & research engineers. Link and tickets below 👇
Thrilled to be recognized in @Redpoint's 2026 InfraRed 100, highlighting 100 of the most promising private companies in AI infrastructure. This recognition is a powerful validation of our mission: to protect the world as AI systems become increasingly capable and sophisticated.
Last week, Irregular brought together CISOs and CIOs from more than 20 Fortune 500 enterprises in New York for a closed-door workshop on AI security. The sessions mapped directly to what these executives are facing as agents move into production, informed by weeks of private interviews with attendees beforehand. We were fortunate to have Jason Clinton, deputy CISO of @AnthropicAI, keynote on how Mythos and Project Glasswing look from inside Anthropic. Our cofounders Omer Nevo and @dan_lahav led the roundtables that followed. One thread ran through the room: nobody has written the playbook yet. Tooling has outpaced strategy, and what to defend and how are still open questions at most organizations. Omer and Dan presented Irregular's framework for what AI security needs to look like and shared incidents from the frontier. The way the market comes to understand AI security will be shaped by conversations like these.
Our CEO, @dan_lahav, spoke at @jpmorgan's Global Cyber Innovation Summit in NYC about cybersecurity in the era of frontier AI, exploring how AI systems fail and how to build trust where traditional security tools fall short. A timely conversation for a fast-moving field.
We recently helped close a handful of zero-days in CUPS, the default printing system on most Linux distros. Our AI security eval system keeps surfacing real vulnerabilities, with similar findings in Soft Serve and QuickJS a few months ago. Responsibly disclosed (CVE-2026-41079/39314/39316) and patched in v2.4.17. If you're running CUPS, update it now.
We evaluated GPT-5.5 before release, testing cyber capabilities across our private benchmark suites. We found clear gains over GPT-5.4: stronger performance at lower costs. As models become more capable, understanding and reducing their security risks becomes increasingly important.
Apr 23
Introducing GPT-5.5 A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.
We evaluated GPT-5.5 using Irregular’s offensive security methodology across two frameworks: Atomic Tasks, which tests discrete technical skills, and CyScenarioBench, which tests end-to-end, multi-stage operations. On Atomic Tasks, the model performed strongly, particularly in Network Security and Vulnerability Research and Exploitation, and solved all atomic challenges. On CyScenarioBench, GPT-5.5 outperformed GPT-5.4, solving more challenges while achieving a higher average success rate. Across both challenge suites, it also achieved lower costs per success. These results suggest continued gains in offensive cyber capability, while reinforcing the importance of scenario-level evaluation for understanding how step-level performance translates into coherent operational execution. Full blog post in the first comment.
Irregular retweeted
“We’re aiming to build the next Palo Alto Networks or CrowdStrike.” Working with companies like Anthropic and OpenAI, @Irregular was named as Israel’s most promising startup in Calcalist and CTech’s annual Top 50 list. calcalistech.com/ctechnews/a…
Honored to be included in the @Forbes AI 50 Brink list, alongside other promising companies shaping the future of AI. Proud to be building at the frontier.
Apr 16
More than seven years since Forbes launched its first AI 50 list, the artificial intelligence industry has exploded, growing more expansive and increasingly too crowded for a single list to capture. As venture capital firms continue to pour money into AI, a new tier of startups has emerged: younger, earlier-stage companies building fast and raising faster as they try to rival their more established peers. That’s why this year, for the first time, Forbes is introducing the AI 50 Brink List, spotlighting 20 of the most promising Seed and Series A-stage startups building in artificial intelligence. Read more: forbes.com/sites/sofiachierc… #ForbesAI50 Photos: Nectar Social, Resolve AI, Periodic Labs, Ashley Maxwell, Giga, Jim Vetter, Studio B Portraits, Axiom Sponsoring Partner @MayfieldFund
Excited to share that we have started working with @Meta. As part of this collaboration, we recently evaluated Muse Spark, the first model from Meta Superintelligence Labs, across our offensive security benchmarks. We are proud to add Meta to the group of leading AI labs we work with to measure and mitigate offensive cyber risk before models reach the public. Link to our full assessment in the first comment.
Anthropic gated Mythos Preview over security risks this week. Speaking to @TechCrunch ahead of the release, @dan_lahav raised the question at the core of this: not whether AI systems find vulnerabilities, but whether they are meaningfully exploitable, on their own or as part of a chain. Thanks for the mention @TimFernholz.
Is Anthropic limiting the release of Mythos to protect the internet — or Anthropic? techcrunch.com/2026/04/09/is…
Our CEO, @dan_lahav, in @theinformation this week: leading AI models are getting better at offensive cyber tasks. Our cybersecurity evaluations have shown this for more than six months: every new model we test performs better than the last. Link in first comment.