AI Architect @qventus Prev LITS and Partnerships @llama_index, Messaging Apps @Apple, HFT @ GETCO, @Citadel

Joined May 2008
Pinned Tweet
GPT-5.4: the most RLed joke teller yet. @Ed_Miliband levels of consistency.
3
774
AI Acceleration in a single chart.
Replying to @AnthropicAI
Today, Anthropic engineers on average ship 8x as much code per quarter as they did in 2021-2025.
2
1
174
Yi Ding -- prod/acc retweeted
Benchmarks place GPT 5.5 as the best model on SWE, but is it the best at making apps end-to-end? Turns out Opus 4.8 continues to be the king of vibe coding on both price & performance. Introducing ViBench: the first benchmark for app creation based on real world tasks
100
64
806
64,547
Yi Ding -- prod/acc retweeted
4,226 voices have already been recorded across 147 Humans in AI Week events happening around the world this week. I do not think I would have believed that three years ago.

In early 2023, a small group of us were sitting in a cozy San Francisco apartment trying to make sense of ChatGPT, AGI, and what this all meant for the world. The technology felt historic. The thing I remember most was the feeling in the room. People were thinking out loud. Changing their minds. Admitting confusion. Getting excited. Getting scared. Finding language together. That night became @AICollectiveCo.

Three years later, this community has grown to 250,000 members, 200 chapters, 1,700 events, and 650 volunteer organizers around the world. This week, that same instinct is becoming Humans in AI Week: June 1–7, 2026, across 100 cities and 50 countries. A global time capsule for the AI era.

The question is simple: What does it mean to be human in the AI era? The answers are messy. That is the point. Some people are excited. Some are scared. Some feel behind. Some feel superpowered. Most of us are carrying a few contradictory feelings at once.

That is why I still believe so much in rooms full of real people. Online, AI discourse collapses into extremes. In person, the temperature changes. People listen longer. The anxious parts get named with more care. The optimistic parts become less abstract. You remember there is a person behind every position.

That has always been the magic of this community. We can gather frontier builders, curious newcomers, artists, students, founders, policy people, educators, and skeptics into the same conversation, then let the room do what the internet usually cannot: slow people down enough to hear each other. Humans in AI Week feels like the culmination of 3 years of learning how to create those rooms.

For me personally, this is also a handoff moment. I stepped back from leading The AI Collective because the community had become bigger than any one person, and because AJ, Catherine, and the team were ready to carry it into its next chapter. Watching them turn the original spark into something this global is SO special.

Deeply grateful to @AJs_AI, @catrosemcmillan, our chapter leads, our volunteers, our partners, and every person showing up or adding their voice this week. This is what we built the community for. Onwards and upwards!! 🚀
2
8
12
597
Something I've seen in many Chinese LLMs is they generate sentences with mixed Chinese and English. First time I've seen this in Claude, which makes me think they've been training on (advertently or inadvertently) Chinese model output. 真=real
1
4
283
My addition to this RLM discourse is this reply from Omar in January. At the time the paper came out, some of us were questioning whether RLMs were just what harnesses like Claude Code were already doing, but Omar had an important point: they were missing recursion. With Dynamic Workflows, Claude Code finally has built-in recursion, albeit only one level. That's great news for us practitioners because we can now coin new terms like RLM 2.0, GraphRLM, Hybrid RLM, Agentic RLM, and so on.
Totally. The PLAN․md pattern, a coding environment, and recursion are all you need to have a complete RLM! Turns out that these three pieces together give you an extremely general and strong inference scaling axis for handling what appear to be arbitrarily long prompts. That's all!
1
3
85
12,949
Btw, it's kind of funny that the LLMs want to write TypeScript so much that Anthropic has to explicitly tell them not to. #FreeTheTypes @alexyang
So the super impressive thing about dynamic workflows that people are sleeping on is that it isn't "deterministic." It's literally just a prompt, albeit a fairly detailed one, teaching the agent to write a graph-like description in Javascript. A lot of people thought this kind of thing would be possible 3 years ago, but were too early. The feature currently has a lot of manual tuning (in the same way Deep Research did when it was first released), but it's still super impressive to see the dream become a reality.
1
351
Yi Ding -- prod/acc retweeted
I remember hearing @karpathy himself say that autonomous agents would, like self driving cars, take a decade to work at human-like levels. To see the Claude Code team deliver it in 3 years is both mind-blowing and just a testament to the continued exponential trajectory.
1
2
5
1,528
So the super impressive thing about dynamic workflows that people are sleeping on is that it isn't "deterministic." It's literally just a prompt, albeit a fairly detailed one, teaching the agent to write a graph-like description in Javascript. A lot of people thought this kind of thing would be possible 3 years ago, but were too early. The feature currently has a lot of manual tuning (in the same way Deep Research did when it was first released), but it's still super impressive to see the dream become a reality.
May 28
someone hit me up about the new "claude dynamic workflows" feature, claiming "see, multi-agent works". But really, the launch of this feature proves the exact point that I made back in June of 2025, along with @walden_yan, @tobi, @karpathy, and many others: deterministic workflows orchestrating small agent loops beat non-deterministic multi-agent or "agent soup" systems every dang time. everything is context engineering
8
18
281
40,386
Claude Workflows have a built in reviewwiggum function called loop-until-dry:
1
2
24
2,334
Yi Ding -- prod/acc retweeted
Claude Code is finally an RLM (oct 2025), congrats to Anthropic :-)
New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.
18
42
491
79,029
Yi Ding -- prod/acc retweeted
May 28
Super excited to finally share Dynamic Workflows in Claude Code!! We built this a couple months ago, and it has slowly become a daily driver for a bunch of people at Anthropic. A few tips for getting the most out of it 🧵 x.com/ClaudeDevs/status/2060…

New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.
97
170
2,462
491,681
It looks like workflows are more powerful than I thought. Dynamic workflows allow workflows to contain loops, etc., which is probably why they're defined as JS rather than just a JSON DAG or something.
1
157
So I thought Claude Code workflows were orchestrated via some kind of JSON/YAML specification but turns out they're defined using JS!
1
1
427
The workflow viewer in Claude Code could really use some faster scrolling commands (4j/G/etc.) @amorriscode
2
92
Claude Code now helps you be a better prompter: Prompt engineering is dead. Long live prompt engineering.
1
152
Opus 4.8 does substantially better on HealthBench Professional (benchmark from OpenAI) than previous models. 5.5's corresponding score is 51.8.*
*Claude models in this graph are graded with Sonnet 4.6, whereas 5.5 is graded with 5.4/low.
167
npm install -g @anthropic-ai/claude-code@2.1.154
1
1
67
Using Opus 4.8, seeing this new error message. Anybody have an idea why they made this change?
2
227