Drew Breunig

Tweets

Drew Breunig

@dbreunig

Jun 12

6-month prediction: agentic coders will use giant models to scaffold, set up, and plan; small model swarms for implementation, review, and verification. Cursor, Conductor, Cognition, and other 3rd party harnesses will surge with this pattern, unless Fable-level models truly come down *dramatically* in price.

A few things are converging to drive this:

- Fable is legitimately great, but too expensive.
- Small models are getting really good in short bursts (go try Gemma E2B. It’s insanely good for a 2B model.)
- Ensembles are a real pattern with legs (even Copilot(!) catches issues in Fable PRs)
- AI cost controls are arriving in the enterprise
- Anthropic and OAI shifted enterprise costs to usage based
- Teams want to collaborate, and Anthropic and OAI aren’t focused on that pattern

Again, Fable-tier models could get super cheap and fast and surprise us all. But the current crop of small models is *orders of magnitude* cheaper; a good harness could unlock their potential.
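The split predicted here (one large model planning, a swarm of small models implementing and reviewing) could be sketched roughly as follows. This is a minimal illustration, not any particular harness's design: the `Model` type, `run_task`, the "one step per line" planner contract, the "OK" review convention, and all the stub models are assumptions made up for the example.

```python
from typing import Callable, List, NamedTuple

# Assumption for illustration: a "model" is any function from prompt text to
# response text. A real harness would wrap provider SDK calls behind this shape.
Model = Callable[[str], str]


class StepResult(NamedTuple):
    step: str
    draft: str
    approved: bool


def run_task(task: str, planner: Model, worker: Model,
             reviewers: List[Model], max_attempts: int = 2) -> List[StepResult]:
    """Plan once with the large model, then implement and review each step."""
    # Hypothetical contract: the planner returns one step per line.
    steps = [line.strip() for line in planner(task).splitlines() if line.strip()]
    results = []
    for step in steps:
        draft, approved = "", False
        for _ in range(max_attempts):
            draft = worker(step)
            # Each reviewer votes; a reply starting with "OK" counts as approval.
            votes = sum(1 for review in reviewers
                        if review(draft).strip().upper().startswith("OK"))
            approved = votes * 2 > len(reviewers)  # strict majority
            if approved:
                break
        results.append(StepResult(step, draft, approved))
    return results


# Stub models standing in for real ones, so the sketch runs offline.
def big_planner(task: str) -> str:
    return "write parser\nwrite tests"


def small_worker(step: str) -> str:
    return f"done: {step}"


def lenient_reviewer(draft: str) -> str:
    return "OK"


def strict_reviewer(draft: str) -> str:
    return "OK" if "tests" in draft else "REJECT: no tests"
```

Calling `run_task("build a CSV tool", big_planner, small_worker, [lenient_reviewer, strict_reviewer, strict_reviewer])` runs the expensive planner once and the cheap worker and reviewers per step. Steps that never win a majority come back with `approved=False`, so a caller can escalate them to the larger model.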

Drew Breunig

@dbreunig

20h

And another one: omnigent.ai/

Omnigent

An open-source meta-harness for building and running AI agents. Compose harnesses, govern them with policies, and share live sessions.

omnigent.ai

Drew Breunig

@dbreunig

Jun 13

One certainty everyone should take away from this week: you cannot rely on a single model.

Noah Ziems

@NoahZiems

Jun 9

Even setting token costs aside, I think it will become increasingly clear to companies of all sizes that relying on closed source models for ~anything important is a massive supply chain risk.

Drew Breunig

@dbreunig

Jun 12

So is everyone going to manage multiple versions of skills?

Drew Breunig

@dbreunig

Jun 12

Btw, if you provide a skill to users and want to ship a Fable version, check out the pattern in gskill:

Shangyin Tan @ShangyinT

Feb 20

GEPA for skills is here! Introducing gskill, an automated pipeline to learn agent skills with @gepa_ai. With learned skills, we boost Claude Code’s repository task resolution rate to near-perfect levels, while making it 47% faster. Here's how we did it:

Drew Breunig

@dbreunig

Jun 10

Tried Fable in a Claude Code mobile session (on vacation). It refactored a large app and updated some rotted tests. Did well, less guidance needed than Opus. But! GitHub Copilot(!) made three good comments on PR review, which Fable accepted. Make of that what you will.

Drew Breunig

@dbreunig

Jun 10

Text diffusion!

Google Gemma

@googlegemma

Jun 10

Meet DiffusionGemma! An experimental open model that explores a fast approach to text generation, released under an Apache 2.0 license. Moving beyond sequential, token-by-token processes to generate entire blocks of text simultaneously. Here’s what’s new with DiffusionGemma: 👇

Drew Breunig

@dbreunig

Jun 10

Everyone building programs with AI now needs to:

1. Be able to swap out models, in short order.
2. Be able to verify, regularly, that your output isn’t silently changing.

(You should have been doing this for any production system, with evals, but now there’s no excuse.)
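Both requirements can be illustrated with a small sketch: treat every model as an interchangeable function, record a baseline of outputs for a fixed prompt set, and re-check it on a schedule or after a swap. Everything named here (`Model`, `fingerprint`, `record_baseline`, `find_drift`, and the stub models) is hypothetical. Exact-match hashing only makes sense for deterministic outputs; a real eval suite would more likely score outputs than hash them.

```python
import hashlib
from typing import Callable, Dict, List

# Assumption for illustration: any function from prompt to text is a "model",
# so swapping providers means swapping one function.
Model = Callable[[str], str]


def fingerprint(text: str) -> str:
    """Hash of whitespace- and case-normalized output."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def record_baseline(model: Model, prompts: List[str]) -> Dict[str, str]:
    """Run the fixed prompt set once and store a fingerprint per prompt."""
    return {prompt: fingerprint(model(prompt)) for prompt in prompts}


def find_drift(model: Model, baseline: Dict[str, str]) -> List[str]:
    """Return the prompts whose output no longer matches the baseline."""
    return [prompt for prompt, expected in baseline.items()
            if fingerprint(model(prompt)) != expected]


# Stub models standing in for real providers, so the sketch runs offline.
def model_a(prompt: str) -> str:
    return f"Answer: {prompt.upper()}"


def model_b(prompt: str) -> str:
    # Same content as model_a, differing only in case and spacing.
    return f"answer:   {prompt.upper()}"


def model_c(prompt: str) -> str:
    # Different content: reverses the prompt.
    return f"Answer: {prompt[::-1]}"
```

Recording `baseline = record_baseline(model_a, ["hi", "ok"])` and then calling `find_drift(model_c, baseline)` flags both prompts, while `find_drift(model_b, baseline)` flags none, since normalization ignores the cosmetic differences.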

Drew Breunig

@dbreunig

Jun 10

The imperfect and awkward ways Anthropic controls how its models are used (with Fable now, OpenClaw a bit ago) are a great example of the imprecision of natural language as an interface. The best model can’t differentiate a bio threat from an innocuous health or research question, even with specific training and focus. Natural language is imprecise by nature, and you need systems to hold it, and the output it produces, accountable. This issue won’t go away with better models.

Drew Breunig

@dbreunig

Jun 10

Malware authors are including spurious text about bio in an attempt to avoid Fable.

Drew Breunig retweeted

Mike Knoop

@mikeknoop

Jun 10

Measuring test time compute (eg dollars, tokens, flops, watts) is important because efficiency is the high-order bit in the definition of intelligence. It's what distinguishes brute force search from selective reasoning towards a solution.

Noam Brown

@polynoamial

Jun 9

x.com/i/article/205769422698…

Drew Breunig retweeted

Pope Leo XIV

@Pontifex

May 26

When it comes to decisions regarding economic flows and digital platforms, as well as the governance of data and algorithms, we cannot allow a handful of actors to dictate these processes on their own. Instead, we must build forms of cooperation that respect the various levels of the global community and make them jointly responsible for the common good.

Drew Breunig

@dbreunig

Jun 9

Start hoarding your traces now…

Nathan Lambert

@natolambert

Jun 9

Labs starting to pull up the ladders on the ability to diffuse AI was inevitable. Doing it without telling the user is misaligned.

Drew Breunig retweeted

Lance Martin

@RLanceMartin

Jun 9

x.com/i/article/206438055391…

Drew Breunig

@dbreunig

Jun 9

How would you explain how an LLM works in 10 words or less?

Drew Breunig

@dbreunig

Jun 9

A good question I recently got, “If post-training comes after pre-training, when does training occur?”

Drew Breunig

@dbreunig

Jun 9

The amount of “do not hallucinate” in these iOS 27 system prompts does not inspire confidence. gist.github.com/samhenrigold…

iOS 27 system prompts

gist.github.com

Drew Breunig

@dbreunig

Jun 9

Come on, Apple, it’s 2026. “You are an expert food analysis AI specialized in analyzing food images to provide comprehensive nutritional insights.” 🫠

Drew Breunig

@dbreunig

Jun 7

Will just leave this here… dspy.ai/getting-started/prog…

Program, don't prompt - DSPy

The framework for programming—rather than prompting—language models.

dspy.ai

Peter Steinberger 🦞

@steipete

Jun 7

Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.

Drew Breunig

@dbreunig

Jun 7

x.com/dbreunig/status/206304…

Drew Breunig

@dbreunig

Jun 6

Replying to @dbreunig

Among AI enthusiasts, no one argues about the primacy of specs and tests when using agents to write code. But when you suggest they apply the same pattern to AI engineering and prompts, people push back. 🤔
