Co-founder @Calibre_Labs | Applied AI research & consulting | Agents, AI Evals | prev EVP @amplitude_HQ | VC @khoslaventures @sequoia | @stanford @iitbombay

Joined May 2009
Pinned Tweet

Jun 12
There are some substackers/podcasters I liked following whose content has become claudeslop .. the substance and insight might be there but the writing is just so much worse than their previous bests and it makes me sad. I feel an odd sense of loss and dismay as someone who loves reading well written thoughts.
Every time I see loops I read it as more cowbell
Waking up and choosing violence
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
Love this! The new masterclass of our time for software work should start with vocabulary
To get good animations from an AI you need to get good at telling it what you want: - "stagger this list of items" - "make this animation direction-aware" - "spatial consistency", "crossfade", "layout animation", I made a motion vocabulary for this: animations.dev/vocabulary
May 12
Fabulous post on the PM craft, couldn’t agree more. Especially the fact that AI as a source of leverage has shown us how mediocre most PM work has become. I see so many who have completely lost touch with their product’s actual full user experience, from facing friction in the moment of consideration to yelling at AI bots that don’t work.
Just realized that the last TBPN post to pop up in my feed was their acquisition, they might as well have retired if you ask my feed
Apr 24
building AI evals right now feels like CI/CD in 2012.. everyone knows they need it. everyone agrees on what great can be in theory but almost no one actually does it... and the tooling is just barely past the vibes stage. the companies that figure out eval pipelines (converting production issues into real tests) will ship better AI products 10x faster than the ones running on optimism. it also takes a certain mindset to do this well... most teams just don't want to admit their baby could look prettier .. their evals test for life rather than excellence
Apr 22
chat am i the only one (and no this wasn't a computer use session) getting tired of the unending bugs
Apr 22
When you have the kind of PMF Anthropic has, the “growth” team’s job is to not f*ck it up. That’s it.
Apr 21
Easily the most ill considered AB test I have ever experienced - removing Claude Code from the Pro subscription for new users!!!!
Replying to @simonw
My hunch for now is that this was an ill-considered test which they didn't anticipate would be instantly spotted and cause (justified) uproar - here's hoping they decide that the "test" isn't a good idea! If they do go ahead I expect OpenAI Codex to catch Claude Code very fast
Apr 21
Yes I am selling an addictive drug but will NOT give you a taste sir
Apr 21
Increasingly hearing (and feeling) the word “addiction” being used by ai-pilled developers when describing our daily work.. mistaking output for progress and impact the way you mistake euphoria for happiness. I want my moments of friction to function as thinking breaks - forcing me to question whether something should exist rather than shipping. Building /= Shipping. The latter is a means to the former end. H/t @mitsuhiko
Apr 21
Friction is also how you learn and build judgement! Loved this talk. I work with teams on improving quality with evals and monitoring and we always see engineers eager to automate the process before they understand it. You can’t express taste and judgement without first putting in the manual work and creating the ground truth that cloud agents can use to automate your systems. No pain, no gain.
🆕The Friction is Your Judgment youtube.com/watch?v=_Zcw_sVF… @mitsuhiko (creator of Flask) and Cristina Poncela Cubeiro (AI-native Engineer) break down what goes WRONG when you "ship without friction" — listing many areas of agentic coding where you are enticed to turn your brain OFF when you most need to turn it ON.
Sandhya retweeted
I upgraded my Claude token counter tool to compare different models and Opus 4.7 does appear to use 1.46x the tokens for text and up to 3x the tokens for images - it's priced the same as Opus 4.6 on a per-token basis so this is actually a pretty big price bump
Apr 18
The irony. Just as creating web apps became trivial, user preference switched to text files for their agents. That probably covers 80% of apps getting vibecoded.
Apr 16
Copy paste /go
Replying to @bcherny
6/ Give Claude a way to verify its work
Finally, make sure Claude has a way to verify its work. This has always been a way to 2-3x what you get out of Claude, and with 4.7 it's more important than ever.
Verification looks different depending on the task. For backend work, make sure Claude knows how to start up your server/service to test it end to end; for frontend work, use the Claude Chromium extension to give Claude a way to control your browser; for desktop apps, use computer use.
Personally, many of my prompts these days look like "Claude do blah blah /go". /go is a skill that has Claude
1. Test itself end to end using bash, browser, or computer use
2. Run the /simplify skill
3. Put up a PR
For long running work, verification is important because that way when you come back to a task, you know the code works.
Apr 16
Agent experience is the new frontier for saas and infra companies. Most need to rebuild their technical foundations for it. This trend has already been clear for SEO and Dev tools the past 12 months. Now it's all software: "The AI bear case for Snowflake revolves around differences in human vs. agent preferences for accessing data and the continued march of infrastructure that prices to one paradigm becoming obsolete as the world advances."
Every day for the next long while, I'm going to tear down a new public software company and highlight the AI risks/opportunities around it- products launched to date, top startups, key quotes from earnings calls, etc.
Day eighteen: Snowflake $SNOW
Peak share price: $392.15 (Nov 19, 2021)
Share price today: $121.11 (-69%)
EV today: $39.8bn
ARR today: $5.1bn (+30% Y/y)
NRR: 125%
EV/ARR: 7.8x
GAAP Operating Margin: -25% (!!)
EV/Run-rate GAAP EBIT: N/A
Headcount: 9060 (+16% Y/y)
What Snowflake does: Snowflake is the leading cloud data warehouse focused on helping companies store, manage and query tabular business data using SQL. A significant share of the world's largest enterprises have opted to pool their critical data onto/around Snowflake to create a data warehouse of record to power everything from observability to analytics to data applications. The key innovation powering Snowflake's rise was the separation of compute and storage as concepts, allowing users to apply elastic compute against fixed storage, reducing analytical queries that used to take hours to seconds. Like others in the space, Snowflake has expanded into other adjacent areas like python, ETL, BI, etc.
AI bear case: The AI bear case for Snowflake revolves around differences in human vs. agent preferences for accessing data and the continued march of infrastructure that prices to one paradigm becoming obsolete as the world advances. In particular, while Snowflake's query engine works very well at human speeds (loading a dashboard, running a complex SQL query) upstarts like @ClickHouseDB and @motherduck argue that agents have very different preferences and prefer lightning fast queries that would be very expensive on Snowflake. In short, the bear case on Snowflake is that analytical queries will be run by agents in the future, and Snowflake's platform has an architectural innovator's dilemma in serving those use cases.
AI bull case: The reality is, thousands of the world's largest companies have invested huge effort in standardizing/centralizing on Snowflake. The battle to be the system of record for aggregated tabular business data is already over at these companies- it will be Snowflake for the foreseeable future. The implication is that agents are actually a huge tailwind for Snowflake- they will need to access business data to operate, to derive insights, to understand context, etc. and Snowflake's business model has the clear advantage of letting it monetize those queries as if they were coming from a human.
AI traction: It is hard for Snowflake to know exactly what share of its revenue comes from AI-driven queries, but it did say this on the Q4 call: "This quarter, we delivered the largest sequential increase in accounts using AI, bringing the total to more than 9,100 accounts." Beyond that, net retention ticked up last year to 125%, very impressive at this scale.
Adjacent AI-native startup summary: Databricks, albeit not AI-native, is the juggernaut to watch here, with a reported 15,000 employees up 34% Y/y.
Clickhouse - 536 employees, 86% y/y
Motherduck - 133 employees, 46% Y/y
Management Quotes:
"And in just 3 months, Snowflake Intelligence has scaled from a nascent offering to an essential capability for over 2,500 accounts, almost doubling quarter-over-quarter."
"Our deepened partnership with Anthropic is already helping customers like Intercom see significant impact."
"And Matt, just to emphasize that point, just in fourth quarter, we saw a lot of benefit with AI that we had a small reduction in force and about 200 people in the company were impacted. So if you look at our fourth quarter net adds on a headcount basis, we only added 37 people. So AI has really changed the framework for investing in growth. It's no longer tied to headcount."
"So we will be launching features like a per user cap on top of Snowflake Intelligence, so they can feel like there is a clear upper limit to how much they can get charged with an agent. We think models like this that are consumption-based with clear user caps and account caps offer the best of both worlds, which is consumption pricing with price predictability."
"Yes. Super quickly, like partners, customers and our internal field are all incredibly excited about the results we're seeing with Cortex Code. The original value prop of Snowflake, which is change what's possible in terms of ease of use, it's just gone like 10x with Cortex Code. We showcased a number of instances where people are building pipelines faster, transformation faster, insights faster. And I think we're only at the beginning of what is possible."
Commentary: Though the balance of evidence (and certainly my customer work) suggests that Snowflake should be a beneficiary of AI, it is certainly striking that the business impact seems to have been muted thus far. All of the ingredients are there- consumption-based pricing, AI lowering the barriers for humans to ask questions of data (aka AI-generated SQL), and data as a key foundational layer to agents. My suspicion is that some of this disappointment to-date may come from Snowflake's lack of alignment to the use-case where AI is working the best today (i.e. code). Analytical queries may simply be slower/harder to get right- but it certainly seems likely that in a future where agents accelerate the amount of knowledge work done in the enterprise, Snowflake's core business should see a meaningful tailwind. Once that question is answered, the burning question will be whether agent adoption presages an architectural shift towards data warehouses with a more AI-native architecture. My gut is that this will happen at some scale but won't create a wholesale shift and lead to a data warehouse replacement cycle. It will certainly be interesting to watch, though!
Apr 15
Watching the Cowork wave flood into GTM and Ops teams in big companies and dear lord, it has never been more valuable to understand how software works. Creative system thinkers are on a high. Folks without an intuition for software are building skills and plugins the wrong way - not able to make any edge cases work, building fragile, slow token-expensive solutions, not able to test their own skills.. My takeaway from this is that either: 1) AGI gets so good it will correct people and inform them of better approaches to achieve their goals or 2) There will always be a massive market for vertical focused agent platforms that abstract the underlying systems and best practices so people don't need to know how software works
Apr 14
we were always headed here .. the best model lab gets to reinvent the IDE
Today is a big day! We're launching a ~ new ~ version of Claude Code in the desktop app. It's been redesigned from the ground up for parallel work and is a lot faster. It's been my main way to use Claude Code for the last few weeks.