Yi Ding -- prod/acc

Pinned Tweet

Yi Ding -- prod/acc

@yi_ding

Mar 5

GPT-5.4: the most RLed joke teller yet. @Ed_Miliband levels of consistency.

774

Yi Ding -- prod/acc

@yi_ding

Jun 4

AI Acceleration in a single chart.

Anthropic

@AnthropicAI

Jun 4

Replying to @AnthropicAI

Today, Anthropic engineers on average ship 8x as much code per quarter as they did in 2021-2025.

174

Yi Ding -- prod/acc retweeted

Amjad Masad

@amasad

Jun 3

Benchmarks place GPT 5.5 as the best model on SWE, but is it the best at making apps end-to-end? Turns out Opus 4.8 continues to be the king of vibe coding on both price & performance. Introducing ViBench: the first benchmark for app creation based on real world tasks

100

806

64,547

Yi Ding -- prod/acc retweeted

Chappy Asel

@chappyasel

Jun 3

4,226 voices have already been recorded across 147 Humans in AI Week events happening around the world this week. I do not think I would have believed that three years ago.

In early 2023, a small group of us were sitting in a cozy San Francisco apartment trying to make sense of ChatGPT, AGI, and what this all meant for the world. The technology felt historic. The thing I remember most was the feeling in the room. People were thinking out loud. Changing their minds. Admitting confusion. Getting excited. Getting scared. Finding language together.

That night became @AICollectiveCo. Three years later, this community has grown to 250,000 members, 200 chapters, 1,700 events, and 650 volunteer organizers around the world.

This week, that same instinct is becoming Humans in AI Week: June 1–7, 2026, across 100 cities and 50 countries. A global time capsule for the AI era.

The question is simple: What does it mean to be human in the AI era? The answers are messy. That is the point. Some people are excited. Some are scared. Some feel behind. Some feel superpowered. Most of us are carrying a few contradictory feelings at once.

That is why I still believe so much in rooms full of real people. Online, AI discourse collapses into extremes. In person, the temperature changes. People listen longer. The anxious parts get named with more care. The optimistic parts become less abstract. You remember there is a person behind every position.

That has always been the magic of this community. We can gather frontier builders, curious newcomers, artists, students, founders, policy people, educators, and skeptics into the same conversation, then let the room do what the internet usually cannot: slow people down enough to hear each other.

Humans in AI Week feels like the culmination of 3 years of learning how to create those rooms.

For me personally, this is also a handoff moment. I stepped back from leading The AI Collective because the community had become bigger than any one person, and because AJ, Catherine, and the team were ready to carry it into its next chapter. Watching them turn the original spark into something this global is SO special.

Deeply grateful to @AJs_AI, @catrosemcmillan, our chapter leads, our volunteers, our partners, and every person showing up or adding their voice this week. This is what we built the community for. Onwards and upwards!! 🚀

597

Yi Ding -- prod/acc

@yi_ding

Jun 2

Something I've seen in many Chinese LLMs is that they generate sentences with mixed Chinese and English. This is the first time I've seen it in Claude, which makes me think they've been training, advertently or inadvertently, on Chinese model output. 真=real

283

Yi Ding -- prod/acc

@yi_ding

May 30

My addition to this RLM discourse is this reply from Omar in Jan. At the time the paper came out, some of us were questioning whether RLMs were just what harnesses like Claude Code were already doing, but Omar had an important point: they were missing recursion. With Dynamic Workflows, Claude Code finally has built-in recursion, albeit only one level. That's great news for us practitioners because we can now coin new terms like RLM 2.0, GraphRLM, Hybrid RLM, Agentic RLM and so on.

Omar Khattab

@lateinteraction

Jan 12

Replying to @yi_ding @alex__mackenzie @a1zhang

Totally. The PLAN.md pattern, a coding environment, and recursion are all you need to have a complete RLM! Turns out that these three pieces together give you an extremely general and strong inference scaling axis for handling what appear to be arbitrarily long prompts. That's all!
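
To make the "recursion" ingredient concrete, here is a toy Python sketch. It is not the RLM paper's method and not anything Claude Code does; it only illustrates the idea of a model call that hands oversized input to further calls of itself. The names `recursive_answer`, `fake_llm`, `limit`, and `max_depth` are all hypothetical, and the "model" is a stub function.

```python
from typing import Callable


def recursive_answer(
    prompt: str,
    llm: Callable[[str], str],
    limit: int,
    depth: int = 0,
    max_depth: int = 8,
) -> str:
    """Toy recursion: split an over-long prompt, answer each half, then merge."""
    # Base case: the prompt fits the (hypothetical) context limit, or we have
    # recursed as deep as allowed, in which case the prompt is truncated.
    if len(prompt) <= limit or depth >= max_depth:
        return llm(prompt[:limit])

    mid = len(prompt) // 2
    left = recursive_answer(prompt[:mid], llm, limit, depth + 1, max_depth)
    right = recursive_answer(prompt[mid:], llm, limit, depth + 1, max_depth)

    # Merge step: the two partial answers become a new, shorter prompt.
    return recursive_answer(left + "\n" + right, llm, limit, depth + 1, max_depth)


if __name__ == "__main__":
    # Stub standing in for a real model call (assumption for illustration).
    def fake_llm(text: str) -> str:
        return text[:4].upper()

    print(recursive_answer("a" * 100, fake_llm, limit=16))
```

The stub never sees more than `limit` characters at once, which is the property that lets this pattern cope with prompts longer than a single call could take. A real system would replace the blind halving with model-chosen splits.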

12,949

Yi Ding -- prod/acc

@yi_ding

May 29

Btw, it's kind of funny that the LLMs want to write TypeScript so much that Anthropic has to explicitly tell them not to. #FreeTheTypes @alexyang

Yi Ding -- prod/acc

@yi_ding

May 29

So the super impressive thing about dynamic workflows that people are sleeping on is that it isn't "deterministic." It's literally just a prompt, albeit a fairly detailed one, teaching the agent to write a graph-like description in JavaScript. A lot of people thought this kind of thing would be possible 3 years ago, but were too early. The feature currently has a lot of manual tuning (in the same way Deep Research did when it was first released), but it's still super impressive to see the dream become a reality.

351

Yi Ding -- prod/acc retweeted

Yi Ding -- prod/acc

@yi_ding

May 29

I remember hearing @karpathy himself say that autonomous agents would, like self-driving cars, take a decade to work at human-like levels. To see the Claude Code team deliver it in 3 years is both mind-blowing and a testament to the continued exponential trajectory.

1,528

Yi Ding -- prod/acc

@yi_ding

May 29

dex

@dexhorthy

May 28

someone hit me up about the new "claude dynamic workflows" feature, claiming "see, multi-agent works". But really, the launch of this feature proves the exact point that I made back in June of 2025, along with @walden_yan, @tobi, @karpathy, and many others: Deterministic workflows orchestrating small agent loops beat non-deterministic multi-agent or "agent soup" systems every dang time. everything is context engineering

281

40,386

Yi Ding -- prod/acc

@yi_ding

May 29

Claude Workflows have a built-in reviewwiggum function called loop-until-dry.

2,334

Yi Ding -- prod/acc retweeted

Omar Khattab

@lateinteraction

May 28

Claude Code is finally an RLM (oct 2025), congrats to Anthropic :-)

ClaudeDevs

@ClaudeDevs

May 28

New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.

491

79,029

Yi Ding -- prod/acc retweeted

Sid

@sidbid

May 28

Super excited to finally share Dynamic Workflows in Claude Code!! We built this a couple months ago, and it has slowly become a daily driver for a bunch of people at Anthropic. A few tips for getting the most out of it 🧵 x.com/ClaudeDevs/status/2060…

170

2,462

491,681

Yi Ding -- prod/acc

@yi_ding

May 28

It looks like workflows are more powerful than I thought. Dynamic workflows can contain loops and other control flow, which is probably why they're defined as JS rather than as a JSON DAG or something similar.
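
A rough Python sketch of why loops push a workflow format toward code rather than a static graph: the number of passes below is only known at run time, which a fixed DAG cannot express. This is not Claude Code's API or its actual loop-until-dry implementation; `run_until_dry`, `review`, and `max_rounds` are hypothetical names, and the reviewer is a plain callback.

```python
from typing import Callable, List, Set


def run_until_dry(
    review: Callable[[int], List[str]],
    max_rounds: int = 10,
) -> List[str]:
    """Repeat review passes until one pass reports nothing new ("dry")."""
    seen: Set[str] = set()
    ordered: List[str] = []

    for round_no in range(max_rounds):
        fresh = 0
        for finding in review(round_no):
            if finding not in seen:
                seen.add(finding)
                ordered.append(finding)
                fresh += 1
        # Stop as soon as a whole pass adds no unseen findings.
        if fresh == 0:
            break

    return ordered


if __name__ == "__main__":
    # Scripted reviewer standing in for a subagent (assumption for illustration).
    scripted = [["a", "b"], ["b", "c"], ["c"], ["d"]]
    print(run_until_dry(lambda i: scripted[i]))
```

In the scripted run the third pass finds nothing new, so the fourth pass is never requested. The `max_rounds` cap is there so a reviewer that always finds something cannot loop forever.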

157

Yi Ding -- prod/acc

@yi_ding

May 28

So I thought Claude Code workflows were orchestrated via some kind of JSON/YAML specification, but it turns out they're defined using JS!

427

Yi Ding -- prod/acc

@yi_ding

May 28

The workflow viewer in Claude Code could really use some faster scrolling commands (4j/G/etc.) @amorriscode

Yi Ding -- prod/acc

@yi_ding

May 28

Claude Code now helps you be a better prompter. Prompt engineering is dead. Long live prompt engineering.

152

Yi Ding -- prod/acc

@yi_ding

May 28

Opus 4.8 does substantially better on HealthBench Professional (benchmark from OpenAI) than previous models. 5.5's corresponding score is 51.8*

*Claude models in this graph are graded with Sonnet 4.6 whereas 5.5 is graded with 5.4/low.

167

Yi Ding -- prod/acc

@yi_ding

May 28

npm install -g @anthropic-ai/claude-code@2.1.154

Yi Ding -- prod/acc

@yi_ding

May 28

Using Opus 4.8, seeing this new error message. Anybody have an idea why they made this change?

227