Product-led CTO. Early-stage startup operator, advisor & investor. Co-founder & CTO at ScultureAI.

Joined March 2007
Leonidas Tsementzis retweeted
My heuristic is that any diff an agent generates over ~1500 lines is too big and is indicative that the problem needs to be decomposed. This is my general pattern now for feature work:
1. Try to implement the whole feature, loosely guided. I call this the "draw the owl" prompt in reference to the meme. Expect garbage, you're going to get garbage.
2. If the diff is less than 1500 lines, review it and iterate normally. If the diff is more than 1500 lines, prompt the agent to decompose the problem into atomic, incremental, reviewable tasks. Simultaneously, do this yourself.
3. Agents will very often make these tasks way too specific to the shape they solved. You need to massage it into the right general shape. Do that.
4. Kick off new agents to work on those incremental things (as parallelized as possible). Apply the same rules.
5. At a certain point, repeat the "draw the owl" prompt. At some point, you will get beneath your review-ability threshold.
This has been producing consistently high quality, maintainable, reviewable chunks of code that have a good handoff to either merge as-is or human refinement. And with the latest frontier models at xhigh thinking, these are all slow enough that you can usually have multiple going concurrently while you are actively reviewing others or working on your own tasks.
HITL (human-in-the-loop) agents are still super important, especially for feature work. Features touch the human boundary in terms of UI, API, etc. And net new stuff can introduce pathologies in the architecture that violate desired invariants (these should be represented in specs or tests but we aren't perfect!).
I know a lot of the leading edge agentic discourse is about "loops" and agents driving agents continuously. I do some of that (will report on that later). But, in terms of raw daily get-shit-done type of work, this is my most rewarding pattern at the moment.
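The size check in step 2 of the pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's tooling: the helper names `changed_lines` and `next_step` are hypothetical, "diff size" is taken to mean added plus removed lines in a unified diff, and a diff of exactly 1500 lines (a case the post does not cover) is treated as too big.

```python
REVIEW_THRESHOLD = 1500  # the ~1500-line heuristic from the post


def changed_lines(diff_text: str) -> int:
    """Count added and removed lines in a unified diff.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a changed line whose content begins with '--' or '++'
    would be missed.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a content change
        if line.startswith(("+", "-")):
            count += 1
    return count


def next_step(diff_text: str, threshold: int = REVIEW_THRESHOLD) -> str:
    """Return 'review' for a diff under the threshold, else 'decompose'.

    Assumption: a diff of exactly `threshold` lines counts as too big.
    """
    return "review" if changed_lines(diff_text) < threshold else "decompose"
```

In practice the diff text could come from something like `git diff` on the agent's branch, and a "decompose" result would be the cue to ask the agent for atomic, reviewable tasks instead of reviewing the diff as-is.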
Leonidas Tsementzis retweeted
> But, I don't have a concrete answer here, because unlike product and software development, I'm not directly building a commercializable product right now. […] When I walk the walk and learn more, I'll share more.
This is why Mitchell is more respected than you, average X shitpoaster. He doesn’t spew uninformed dumb opinions and puts substance into his thoughts. Great post.
This strategy is indeed gold. I've been using an almost identical approach, and the impact on cost and performance has made a significant difference.
Leonidas Tsementzis retweeted
Last quarter I rolled out Microsoft Copilot to 4,000 employees. $30 per seat per month. $1.4 million annually. I called it "digital transformation." The board loved that phrase. They approved it in eleven minutes. No one asked what it would actually do. Including me.
I told everyone it would "10x productivity." That's not a real number. But it sounds like one. HR asked how we'd measure the 10x. I said we'd "leverage analytics dashboards." They stopped asking.
Three months later I checked the usage reports. 47 people had opened it. 12 had used it more than once. One of them was me. I used it to summarize an email I could have read in 30 seconds. It took 45 seconds. Plus the time it took to fix the hallucinations. But I called it a "pilot success." Success means the pilot didn't visibly fail.
The CFO asked about ROI. I showed him a graph. The graph went up and to the right. It measured "AI enablement." I made that metric up. He nodded approvingly. We're "AI-enabled" now. I don't know what that means. But it's in our investor deck.
A senior developer asked why we didn't use Claude or ChatGPT. I said we needed "enterprise-grade security." He asked what that meant. I said "compliance." He asked which compliance. I said "all of them." He looked skeptical. I scheduled him for a "career development conversation." He stopped asking questions.
Microsoft sent a case study team. They wanted to feature us as a success story. I told them we "saved 40,000 hours." I calculated that number by multiplying employees by a number I made up. They didn't verify it. They never do. Now we're on Microsoft's website. "Global enterprise achieves 40,000 hours of productivity gains with Copilot." The CEO shared it on LinkedIn. He got 3,000 likes. He's never used Copilot. None of the executives have. We have an exemption. "Strategic focus requires minimal digital distraction." I wrote that policy.
The licenses renew next month. I'm requesting an expansion. 5,000 more seats. We haven't used the first 4,000. But this time we'll "drive adoption." Adoption means mandatory training. Training means a 45-minute webinar no one watches. But completion will be tracked. Completion is a metric. Metrics go in dashboards. Dashboards go in board presentations. Board presentations get me promoted. I'll be SVP by Q3.
I still don't know what Copilot does. But I know what it's for. It's for showing we're "investing in AI." Investment means spending. Spending means commitment. Commitment means we're serious about the future. The future is whatever I say it is. As long as the graph goes up and to the right.
Thank you, Microsoft Word, for the helpful suggestion.
At this point, randomly allocating items to a category might be better than using AI.
iPhone 17, inspired by… #AppleEvent
Leonidas Tsementzis retweeted
trump puts a tariff on joins
Leonidas Tsementzis retweeted
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
Apple’s new paper on the limits of AI reasoning is the most grounded research I’ve read in a while. Under complexity, LLMs don’t degrade. They collapse. If you’re building on top of LRMs/LLMs, give this a read and let me know what you think. leotsem.com/blog/the-illusio…
Leonidas Tsementzis retweeted
First time founders focus on product, second time founders focus on distribution, third time founders focus on memes
Birthday steak hits different
Stormy Worthing
Every developer platform company: We should enable developers to build applications for our platform. Microsoft:
Anyone using Azure OpenAI (gpt-4o) at scale and noticing intermittent long delays in response to the same inputs? Running some benchmarks and about 10% of the requests are like 4 times slower than the rest.
ERR_TOO_MUCH_MEAT
Just get out and look up
We now need a “Final Cut Camera” app for @DJIGlobal drones #AppleEvent
If you, like myself, were holding back on fully adopting @linear simply because of its fairly limited “documents” feature set, you should check this out.
2 May 2024
Introducing the next evolution of projects in Linear. Close the gap between planning and building.
Leonidas Tsementzis retweeted
founders adding “AI” to their landing pages and pitch decks
