Shawn Simister

Tweets

Pinned Tweet

Shawn Simister @narphorium

Mar 11

I've been thinking about why verifying AI agent output feels so much harder than writing the spec that produced it. That question led me to rethink where my attention actually belongs in the process, and eventually to build atelier.dev narphorium.com/blog/decision…

Documentation

A VS Code workshop where your thinking is grounded in your codebase

atelier.dev

Shawn Simister @narphorium

Jun 12

Every time an agent explores your codebase it builds up its own mental model of how the code works, and then throws that model away when the session ends. Code walkthroughs turn that model into a step-by-step guide to bring you up to speed on any part of the system.

Shawn Simister @narphorium

Jun 12

Rather than reading diffs line-by-line, you step through the agent's account of what it did and diff it against your own mental model. This is exactly the sort of high-level decision point which surfaces misunderstandings and gaps in the design. narphorium.com/blog/decision…

Optimizing for Decision Points | Shawn Simister

Designing Workflows for Human Judgment and Taste

narphorium.com

Shawn Simister @narphorium

Jun 12

Atelier walkthroughs can use the Chrome MCP in Claude Code to build UI walkthroughs with live screenshots of the app. Grounding the walkthrough in execution exposes gaps in the implementation that you wouldn’t catch with code review.

Shawn Simister retweeted

DaEun Choi @daeun_choi_

Jun 12

GenAI is good at producing a result. But early-stage design is not about a single result —it is about exploring possibilities. Our #DIS2026 paper “IdeaBlocks” won an Honorable Mention 🏆 We ask: how can designers express not only what to generate, but how to explore? (1/n)

Shawn Simister @narphorium

Jun 6

In 1985 Peter Naur argued that a program is more than just its source code. "Programming As Theory Building" explained how we build theories of the code that help us debug and refactor it, but those theories rely on knowledge from outside the code. pages.cs.wisc.edu/~remzi/Nau…

Shawn Simister @narphorium

Jun 6

I feel like over the years, we've sort of given up documenting our mental models of code and just accepted that everyone who reads the code builds their own model from scratch.

Shawn Simister @narphorium

Jun 6

Now, with vibe coding we don't even do that. The agent is often the only one with a mental model of the code, and it throws it away after each session.

Shawn Simister retweeted

Gilad Bracha

@Gilad_Bracha

Jun 5

The future role of the software engineer is using AI to translate informal requirements into high level formal specs, and reviewing those. The AI implements the specs, and verifies against the formal spec using a theorem prover. The human is there so we can blame them when things go wrong; the human's job is to ensure the formal spec is correct; that is the code they review. If it seems wrong, they tell the AI and discuss. The human writes nothing but natural language.

Shawn Simister retweeted

Ethan Mollick

@emollick

Jun 2

Big paper on AI coding agents using GitHub & other data. The auto-complete tools (Copilot) led to 2.2x more code, local agents like original Claude Code led to 7.4x, & current remote coding agents 17.3x(!) But human bottlenecks in coding mean actual releases "only" went up 30%.

Shawn Simister retweeted

Omar Khattab

@lateinteraction

May 26

your novel idea, when you ask an llm to fill in the details

Drew Breunig

@dbreunig

May 26

We need a name for this, because Armin is putting his finger on a problem that’s everywhere: people running their writing through an LLM because they think it makes it clearer, when in actuality it sands off all the detail.

Shawn Simister retweeted

Jeremy Howard

@jeremyphoward

May 22

We desperately need better ways of evaluating models. Something that shows how helpful they are at working hand-in-hand with humans to help them get stuff done in a cooperative/iterative way. The Claude models have consistently been better at this, and the market rewards that.

Shawn Simister retweeted

Chuanyang Jin

@chuanyang_jin

May 20

What are users thinking during their interactions with LLMs? We introduce ThoughtTrace: the first large-scale dataset that captures what users think during real-world human–AI conversations, not just what they type.
→ 10,174 thought annotations
→ 2,155 multi-turn conversations, 17,058 turns
→ 1,058 users
→ 20 LLMs
These thoughts improve user behavior prediction (41.7%) and model alignment (25.6%). This opens a new paradigm of user-centric LLM research. Full information in the thread 🧶 Read our paper: arxiv.org/abs/2605.20087 Check our project website: thoughttrace-project.github.…

Shawn Simister retweeted

Jaimz @Jaimz_with_a_Z

May 17

I always found it hard to document large codebases in a way that made sense to me visually. Thanks to @tldraw I built CodeCanvas, my own infinite canvas documentation tool for mapping out my thought process. Excited to share some of my favorite features.

Jaimz @Jaimz_with_a_Z

Mar 6

I love how customizable @tldraw is. Added a custom markdown editor using @tiptap_editor and I can't get enough of it. I'm having so much fun building whatever this is lol

Shawn Simister retweeted

Dan Shipper 📧

@danshipper

May 9

as ai makes imitation cheaper and cheaper the value of using AI and your brain to make totally new things goes up

Shawn Simister @narphorium

May 9

I sketched this out a few years ago. The HTML vs Markdown debate is conflating substrate with information density. The real question is what kind of feedback an artifact actually invites. Hi-fi invites parameter critique. Lo-fi invites paradigm critique.

Shawn Simister @narphorium

May 10

So now AI has made the high-fidelity artifacts cheaper and easier to create but that doesn't change the rest of the equation. If anything, it makes it easier to fall into the trap of confusing high fidelity with high confidence.

Shawn Simister @narphorium

May 7

Prototyping and experimentation are not slop. Slop is when you don't care how it works. The whole point of prototyping is that you care deeply about finding what works. narphorium.com/blog/top-down…

Top-Down vs. Bottom-Up Development | Shawn Simister

Why do AI tools work for some developers but not others?

narphorium.com

Mitchell Hashimoto

@mitchellh

May 7

AI slop is good, actually. Slop is what enables fast parallel experimentation. The etiquette and skill is understanding the boundaries of where slop exists and the extent to which it should be cleaned up and how. A few examples:

I’m working on the internals of some system right now. The API and GUI of this thing is fully zero shame slop. It’s horrible. But it lets me focus on the core quality while shipping a usable piece of alpha quality software to testers (transparent about the slop frontend).

Similarly, this system has plugins. We sent agents in Ralph loops overnight to generate dozens of plugins. The plugins are slop. The quality is bad. The plugin API/SDK is absolutely not done. But we can test a full GUI with a full plugin ecosystem. When we change the API, we can regenerate them all. The cost of change is just tokens, the velocity is incomparable to before.

I built Terraform. We tested and shipped TF 0.1 with about 3 very weak providers. Because we ran out of time. Building was slow. And when we changed our SDK the cost was immense. Totally different today, 10 years later. Today, I would’ve slop generated 100 providers (again, with transparency and cleanup later, but just to prove it out).

As an anti example, I would not PR this (without prior warning) to another project. I would not throw this onto customers without full review or transparency (as I’m already doing). I would not accept first pass slop. It’s almost never right.

Slop is a tool. And like anything else it’s not blanket bad or good. The context is everything.

Shawn Simister retweeted

Sophia Xu

@thesophiaxu

May 6

Sharing a preview of an experimental tool I've been working on: A canvas-based IPython-compatible computational notebook exploring what "human-in-the-loop" looks like in an age of autonomous AI agents. More updates coming soon!

Shawn Simister @narphorium

Apr 30

One of the most famous power-user tools in the world is switching to an AI chat interface 😬 "This will be the new Terminal. This will be the primary way most interactions are happening..." wired.com/story/the-bloomber…

The Bloomberg Terminal Is Getting an AI Makeover, Like It or Not

WIRED spoke with Bloomberg’s chief technology officer about the big, chatbot-style changes coming to the iconic platform for traders.

wired.com