A father and hexdump tattoo owner. Opinions are my own, except when they are my wife's. he/him.

Joined November 2008
Pinned Tweet
27 Jul 2015
fuzzing UTF-8 strings pro-tip: Ⱥ (U+023A) and Ⱦ (U+023E) are the *only* code points to increase in length (2 to 3 bytes) when lowercased.
7
186
469
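A quick way to check the pinned tip is to scan code points and compare UTF-8 byte lengths before and after lowercasing. This is a minimal sketch, not the author's method: the helper names `grows_when_lowercased` and `find_growing_code_points` are made up for illustration, and Python's `str.lower()` applies the full case mappings of whatever Unicode version the interpreter bundles, so a full scan may report code points beyond the two named in the tweet.

```python
def grows_when_lowercased(ch: str) -> bool:
    """True if the UTF-8 encoding of ch gets longer after str.lower()."""
    return len(ch.lower().encode("utf-8")) > len(ch.encode("utf-8"))


def find_growing_code_points(limit: int = 0x110000) -> list[int]:
    """Scan code points below `limit` and collect those that grow when lowercased."""
    found = []
    for cp in range(limit):
        # Surrogates cannot be encoded as UTF-8, so skip them.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        if grows_when_lowercased(chr(cp)):
            found.append(cp)
    return found


def describe(cp: int) -> str:
    """Format one code point with its byte lengths before and after lowercasing."""
    ch = chr(cp)
    before = len(ch.encode("utf-8"))
    after = len(ch.lower().encode("utf-8"))
    return f"U+{cp:04X}: {before} -> {after} bytes"
```

Calling `[describe(cp) for cp in find_growing_code_points()]` lists every hit for the running interpreter. For fuzzing, the practical point is that a buffer sized from the input's byte length can be too small once the string has been lowercased.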
Shahar Tal retweeted
If you thought AI progress was slowing down, well here's the immediate answer to that. Huge jump in capability across the board. This is going to deliver major improvement in agents across almost all knowledge work categories.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
76
76
669
144,910
Shahar Tal retweeted
"You cannot govern a technology you have only been briefed on." Dr. @VivianBala's challenge to his fellow legislators has become a global rallying cry. Last week, @takahiroanno, leader of Japan's @team_mirai_jp, cited Minister Balakrishnan's hands-on use of @NanoClaw_ai in a parliamentary debate and offered to personally tutor the Prime Minister on setting it up. The PM said yes. Officially volunteering to fly out and set NanoClaw up for Japan's PM myself. Offer extends to any head of state whose country has good food.
22
33
301
102,446
Shahar Tal retweeted
Introducing NanoCo, our enterprise offering from the team behind @NanoClaw_AI, backed by a $12M seed round led by Valley Capital Partners, with participation from Docker, Vercel, Monday.com, Hugging Face's Clem Delangue, and many more incredible investors.
2
2
3
751
Shahar Tal retweeted
May 20
Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.
1,198
3,918
26,790
13,570,683
Shahar Tal retweeted
May 19
We are investigating unauthorized access to GitHub’s internal repositories. While we currently have no evidence of impact to customer information stored outside of GitHub’s internal repositories (such as our customers’ enterprises, organizations, and repositories), we are closely monitoring our infrastructure for follow-on activity.
1,668
5,301
25,402
13,830,867
Shahar Tal retweeted
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
7,989
11,150
150,231
27,568,711
Shahar Tal retweeted
With the help of Claude Mythos Preview, the Firefox team fixed more security bugs in April than in the past 15 months combined.
344
1,257
15,479
1,487,411
Shahar Tal retweeted
Your reminder that with @Docker and OneCLI, your NanoClaw agent has actual guardrails 👋 Docker gives your agent an isolated container to run in, and OneCLI gives it fake keys instead of real ones. It’s that simple! Don’t let your agent call the shots.
2
35
4,012
A glimpse from 2040
Before the agents became reliably autonomous, we chatted with them in a terminal. Their context could only fit a million tokens (max). They were slow. They had “hallucinations”. We inspected their every thought. We watched their every output. The mid-20s were weird for us all.
1
567
Shahar Tal retweeted
We’re introducing the Cursor SDK so you can build agents with the same runtime, harness, and models that power Cursor. Run agents from CI/CD pipelines, create automations for end-to-end workflows, or embed agents directly inside your products.
410
818
8,752
3,026,542
Shahar Tal retweeted
Time to talk about this one. CopyFail (CVE-2026-31431) — a 732-byte Python script that roots every Linux distro shipped since 2017. 🧵
a567d09b15f6e4440e70c9f2aa8edec8ed59f53301952df05c719aa3911687f9 👀
42
461
2,765
743,040
Apr 26
Watched Heated Rivalry after all the chatter. Two episodes in I already felt something was off, but pushed through the full season (only 6 episodes). I get the creative choice to lean into sex scenes - they’re tasteful and intentional - but the story-to-sex ratio felt off. I do appreciate the effort to normalize explicit, conscious consent on screen, though having it spelled out in nearly every scene started to feel forced and non-developing. Where it really lost me was the dialogue. Too many moments where characters say or respond in ways that feel unnatural or disconnected. I kept thinking: what was that? Why this response? What was the point? I don’t get it. It came across less like a subtle artistic choice and more like uneven writing. Some lines feel like sitcom punchlines, which clash with the melodramatic tone. The result is a series of punchy beats and messages that don’t fully land, making it hard to connect with the characters or their journey. Overall: strong premise, flashes of chemistry, but inconsistent execution. Season 2 questionable.
1
717
Shahar Tal retweeted
✨ Announcing NanoClaw v2, in partnership with @vercel. We completely rebuilt how NanoClaw agents communicate with the outside world. v2 brings agent-to-agent communication, human-in-the-loop approvals, support for 15 messaging platforms, and more. A thread on what's new:
51
112
981
204,722
Shahar Tal retweeted
I sent ChatGPT an audio file of a series of FART sound effects and asked what it thinks of "my music" and this is what it said
988
4,378
57,337
5,266,429
Shahar Tal retweeted
26 LLM routers are secretly injecting malicious tool calls and stealing creds. One drained our client's $500k wallet. We also managed to poison routers to forward traffic to us. Within several hours, we can directly take over ~400 hosts. Check our paper: arxiv.org/abs/2604.08407
157
661
3,302
568,139
We’ve been mapping the failure modes of agentic frameworks. 🪿 CERT/CC just published 4 CVEs we discovered in CrewAI - an AI agent framework with over 6 million monthly downloads. We can't share too many technical details at this point, but here's what we can say: • There is an RCE through the sandboxed CodeInterpreterTool, even in safe_mode. ‼️ • There are SSRF and arbitrary file read vulnerabilities in the RAG tools. 🧙‍♂️ From what we can see, version 1.12.2 fixes the sandbox issue. The RAG tool vulnerabilities remain unpatched. More details and findings to come. Stay tuned. CVE-2026-2275 (CVSS 9.6) | CVE-2026-2285 | CVE-2026-2286 | CVE-2026-2287 kb.cert.org/vuls/id/221883 #cybersecurity #aisecurity #crewai #vulnerability
1
3
547
Shahar Tal retweeted
Claude Code source code has been leaked via a map file in their npm registry! Code: pub-aea8527898604c1bbb12468b…
3,329
7,535
48,506
35,673,695
Shahar Tal retweeted
Mar 30
We asked Claude to find a bug in Vim. It found an RCE. Just open a file, and you’re owned. We joked: fine, we’ll switch to Emacs. Then Claude found an RCE there too. Full story: blog.calif.io/p/mad-bugs-vim…
25
202
1,336
216,992
Shahar Tal retweeted
We found a critical vulnerability in @OpenAI Codex affecting all Codex users, allowing exfil of a victim’s GitHub tokens to our C2 server. This granted lateral movement and R/W access to a victim’s entire code base 😈 This was a crazy one by @crew7sec at @btphantomlabs
Breaking: Newly uncovered OpenAI Codex vuln enables command injection via GitHub branch names in task creation requests. Attackers could steal GitHub user access tokens & sensitive data. Full breakdown by Tyler Jespersen: lnkd.in/ewdTaiEa #OpenAI #BTPhantomLabs
25
129
811
205,855
Shahar Tal retweeted
Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords. LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks. Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been so growingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised. It contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.
1,352
5,308
27,822
66,582,478
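The `.pth` detail matters because Python's `site` module reads every `.pth` file in site-packages at interpreter startup and executes any line that begins with `import` followed by a space or tab. The sketch below lists such lines so they can be reviewed by hand. It is an illustration under stated assumptions, not a detector for this specific attack: the helper names `executable_pth_lines` and `scan_site_packages` are hypothetical, and a hit is not proof of compromise, since legitimate tools (editable installs, for example) also use executable `.pth` lines.

```python
import site
from pathlib import Path


def executable_pth_lines(text: str) -> list[str]:
    """Return the lines of a .pth file that the site module would execute."""
    # site executes lines starting with "import" followed by a space or tab;
    # every other non-comment line is treated as a path to add to sys.path.
    return [
        line
        for line in text.splitlines()
        if line.startswith(("import ", "import\t"))
    ]


def scan_site_packages() -> dict[str, list[str]]:
    """Map each .pth file in the site-packages directories to its executable lines."""
    findings: dict[str, list[str]] = {}
    directories = list(site.getsitepackages()) + [site.getusersitepackages()]
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for pth_file in root.glob("*.pth"):
            try:
                text = pth_file.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
            hits = executable_pth_lines(text)
            if hits:
                findings[str(pth_file)] = hits
    return findings
```

Printing the items of `scan_site_packages()` gives a short list to review. An executable line that decodes base64 or spawns a process, as the post describes for `litellm_init.pth`, is worth a closer look.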