Ben Lesh

Jay Phelps retweeted

Ben Lesh

@BenLesh

May 26

1/ At @ThisDotLabs, I built a Codex Plugin, called Build Web Data Visualization, with DevEx Lead @coreyching from OpenAI. With this plugin, Codex uses image gen to design the app and then implements it really well. See the examples below. Claude can’t do this!🧵

Jay Phelps

@jayphelps

May 23

My current issue with AI for coding: I ask it to check for edge cases, and it does a good job at finding [most] of the ones I didn't think of initially.

Jay Phelps

@jayphelps

May 23

Why is that a problem? That means that my PRs contain edge-case code that is harder to reason about than the simpler PR I would otherwise initially introduce. Which means PR reviews become a lot harder to get past. Humans are used to this workflow: the initial PR of a feature introduces simple code. Over time edge cases are discovered and fixed, but incrementally, so humans are better at accepting "well this sucks but yeah it fixes an edge case" vs. a single larger PR that addresses lots of edge cases all upfront.

Jay Phelps

@jayphelps

May 23

Before AI, code "grows" naturally, incrementally over time, in complexity to handle edge cases. After, it can (if you care to ask AI) start much more complex. And humans aren't ready for it yet.

Jay Phelps

@jayphelps

May 11

Even AI doesn't know how to react to this message

Jay Phelps

@jayphelps

May 11

AI has unlocked things unthinkable before. Bun was ported from Zig to Rust mostly in about a week.

Jarred Sumner

@jarredsumner

May 9

Replying to @jarredsumner

there will be a blog post about this. on what this means for bun, benchmarks, memory usage, maintainability going forward, and also the literal process of doing this (it wasn’t just “claude, rewrite bun in rust. make no mistakes”) this is a 960,000 LOC rewrite, the code truly works, passing the test suite on Linux and soon other platforms. e2e I started working on this 6 days ago. this would’ve been a massive amount of work by hand.

Jay Phelps

@jayphelps

May 11

Of course having robust test coverage was crucial, and it remains to be seen if this is one of those cases where 99.9% of the way is "easy" but the remaining 0.1% takes forever or causes the initiative to fail. x.com/jarredsumner/status/20…

Jarred Sumner

@jarredsumner

May 11

Replying to @jarredsumner

I have pretty high confidence in it at this point. It passes Bun’s test suite on Linux x64 arm64 glibc musl, Windows x64 & arm64, and macOS x64 & arm64. It likely closes about 200 github issues. Still refactoring & simplifying. Still need to write the blog post.

Jay Phelps

@jayphelps

May 7

I'm never reviewing another PR again. Will just file a claim.

nico laqua

@nico_laqua

May 4

Today, we're excited to launch AI Coverage (insurance for when your AI messes up). Insurance was built for risks that have existed for decades. AI is creating a new category very quickly.

Jay Phelps retweeted

Ben Lesh

@BenLesh

Apr 29

Really proud to show off the 3D dungeon game I built for @OpenAI using Codex, GPT 5.5 (and 5.4) and new plugins/skills I worked on. This game was created in about 2 work days, including art and audio, etc. This doesn't even show all of the dev tools that are built in.

corey.ching

@coreyching

Apr 23

GPT-5.5 is here - two demos I wanted to share that are featured in the launch post. One is a playable 3D dungeon arena prototype built with Codex and GPT models — from game architecture and combat systems to HUD feedback, environment textures, and character dialogue. The other is an interactive visualization of live USGS earthquake data, combining a global map, timeline, and depth profile to explore recent seismic activity. Fun launch, check out more in our official blog ➙

Jay Phelps

@jayphelps

Apr 30

Hey. April ends today. You know what that means?

Jay Phelps

@jayphelps

Apr 26

Claude Opus 4.7 seems to be pretty good at writing W3C-style technical specifications for my use cases. It's not great at writing "proposals" for co-workers to review (lots of rambling, extraneous fluff, disjointed) but at tech specs it's pretty good.

Jay Phelps

@jayphelps

Apr 26

It's interesting how I thought they'd be effectively the same skill, but it seems they're not. Maybe Claude wasn't really trained on high quality big-company proposal memos, but was trained on specs like from the W3C.

Jay Phelps

@jayphelps

Apr 26

I thought Claude often not working was just me.

Jay Phelps

@jayphelps

Apr 20

I definitely can sympathize with this sentiment, but not agree with it, especially lately. It's definitely far from perfect still but it helps me solve hard problems. Note that I didn't say it solves them, it helps ME solve them. A great sounding board, but I have to debate it.

patagucci perf papi

@kenwheeler

Apr 19

after a great deal of trial and error, diverse projects and tasks, and evaluating all new hotness, i’ve come to the conclusion that the best way of agentic coding is keeping it ridiculously simple. and anyone saying otherwise is selling something, grifting or dumb. or all 3.

Jay Phelps

@jayphelps

Apr 20

But my original point wasn't about that. It was about having a sounding board for complex ideas and API design. Across numerous clients. In rarely charted waters. It's not that I trust it blindly, it's that it's useful to have something to debate with before I waste people's time.

Jay Phelps

@jayphelps

Apr 20

Nuances aside I'll put it this way: I'm definitely notably more productive with it than I was in September of last year. It's worth the costs for me, at least. And I truly expect it to only get more pronounced. Full on AI-coworker is the inevitable future not too far away.
