MD @z47_vc | Agenting agents | Building @DeVC_Global | past @DisneyPlusHS @Housing | 2X founder 3X CXO | Deep Learning Geek

Joined November 2009
We have entered the middle game now
THE TOKEN HANGOVER @matanSF (Matan Grinberg), CEO and co-founder of @FactoryAI , interviewed by @HarryStebbings (@20vcFund ) This is a special for me since I've been an investor in @FactoryAI since their seed round, and think Matan is a very very special founder. Summary: Grinberg argues the next 24 months in enterprise AI are a resource-allocation problem: tokens, dollars, and people. Most CIOs are now waking up to bills they cannot justify. The fix is to spend frontier tokens only on the 10-20% of work that requires planning intelligence, run the other 80-90% on open models, and rebuild teams around load-bearing polymaths who own business outcomes. The single-frontier-monopoly fear is fading: four roughly-equivalent labs is the emerging reality, which puts pricing power back in the application layer. 1. The Token Hangover. Enterprise AI adoption ran through three phases this year: boards yelling at CEOs about AI strategy, "token maxing" with AI usage written into perf reviews, and now the morning-after bill. One CIO Grinberg spoke to was spending hundreds of thousands of dollars a month on engineers asking Opus 4.8 things like "how's it going" and "what are my macros from lunch." The frontier model became the default surface for every question, no matter how trivial. Phase 3 is the moment routing matters: every call to a frontier model needs to earn its price. 2. Resource Allocation Is the Job. For the next 24 months every C-suite is solving the same problem: how to allocate dollars, tokens, and headcount against business outcomes. Engineering teams used to be judged by features shipped per quarter, a metric with no link to revenue, market share, or retention. A logistics company adding more engineers to ship more features was always solving the wrong problem; AI made the misallocation visible. Tie every person's work to the metric that actually moves the business, then re-allocate. 3. Load-Bearing Individuals. 
The "10x engineer" frame measures lines of code, the wrong unit. Grinberg's unit is the load-bearing individual: the person whose absence breaks something. With AI the load-bearing few compound roughly 10,000%; the others get close to nothing, so any org enforcing one token-spend-per-engineer number is painting with too wide a brush. Average token spend per engineer will land on the same order of magnitude as their salary within three years, with a wildly bimodal distribution. 4. Frontier for Decisions Only. 80-90% of software development tasks can run on open models; the remaining 10-20% is planning, where the frontier still wins. This mirrors how human orgs work: leadership is a tiny share of total hours but decides the company's fate. The ego trap is engineers assuming their work is too important for an open model. The router decides better than the engineer, and the cost curve falls only if you wire the routing. 5. The Kirkland Mistake. Kirkland & Ellis announced a $500M, five-year internal AI build, which Grinberg reads as validation for Harvey rather than a threat. Building AI is not a law firm's core competency, and Kirkland's spend will teach them how hard it is. The general rule: just because you can build it does not mean you should, and the discipline is naming the few things you and your team own end-to-end. Outsource everything else, even when you technically know how to do it yourself. 6. Model-App Separation. When the model provider also sells the app, the incentives split: an API business wants you to spend more tokens. A healthy market keeps the application layer independent, so model providers compete on price, speed, and quality every week. Enterprises do not want to vendor-lock again; every CIO carries scars from the cloud era's three-year discount-then-jack-the-price trap. The application layer survives precisely because it forces that competition. 7. Sales as Product. Name a legendary company with a weak sales or marketing team. You can't. 
The Silicon Valley fallacy that research sits at the top and sales is "dirty work" produces companies that win the gold rush and then collapse when gravity returns. At Factory, engineers and salespeople sit intermixed; when sales closes, engineering says "we closed"; when engineering ships, sales says "we shipped." Atrophied sales muscles will not regrow once enterprise buyers stop saying yes to everything. 8. Polymath Era. Da Vinci, Newton, and Euler could be polymaths because their fields were shallow. By the 2010s a theoretical physicist needed 50 years to reach the frontier before contributing anything new. AI collapses that catch-up time, so one person can push forward developer marketing, token-caching infrastructure, and solution engineering at once. The engineer of the future is a GM who owns marketing copy, product metrics, and sales enablement. 9. Build the Factory. Factory's name is literal: engineers in the next era design the assembly line that produces software. The DevX investments that used to scale linearly with headcount (good docs, CI/CD, linters, pre-commit hooks) now scale with the number of agents you run, which is 10x or 100x larger. Every dollar spent making agents production-ready compounds against thousands of PRs a week. Humans move up the stack, from writing code to designing the system that writes code. 10. Seal Team Six. Mandating beds in the office is a hiring failure dressed up as commitment. Grinberg's image: a basketball game judged by who sweated the most, when the scoreboard is what counts. Factory bought Eight Sleeps for all 30 team members at the time, because recovery is where the gains come from when work requires every ounce of brain power. If your load-bearing engineer can do their best work on two hours of sleep, they were not doing load-bearing work in the first place. 11. Four Frontier Labs. 
Grinberg's biggest mind-change this year: a single dominant model is unlikely, and four roughly-equivalent frontier providers is the more probable steady state. That outcome is the win for humanity. A one-lab monopoly was the dangerous scenario, and four equivalent labs is also the structural bull case for the application layer because it forces real ongoing price competition. Every CIO Grinberg meets has already decided not to throw their lot in with a single provider. 12. Dario's Self-Serving Doom. "AI will take your jobs" was the pitch that helped raise hundreds of billions, and Grinberg thinks it damaged public psychology and fed the slow-AI lobby. Watch the rhetoric flip at IPO: humans will suddenly become important again, because humans are the ones buying the stock. Founders who never needed to raise that money, like Zuckerberg and Hassabis, never made that argument. Incentives drive the labor-displacement rhetoric more than philosophy does.
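The routing argument in point 4 can be made concrete with a small sketch. Everything named here is an assumption for illustration: the model labels, the per-token prices, and the task kinds are hypothetical and do not come from the interview. The point is only to show how sending planning work to a frontier model and the rest to an open model changes the bill.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float


# Hypothetical models and prices, chosen only to make the arithmetic visible.
FRONTIER = Model("frontier-model", 15.0)
OPEN = Model("open-model", 0.5)

# Hypothetical labels for the "planning" share of work.
PLANNING_KINDS = {"plan", "architecture", "design_review"}


def route(task_kind: str) -> Model:
    """Send planning work to the frontier model, everything else to the open one."""
    return FRONTIER if task_kind in PLANNING_KINDS else OPEN


def always_frontier(task_kind: str) -> Model:
    """Baseline: every call hits the frontier model, however trivial."""
    return FRONTIER


def total_cost(
    tasks: List[Tuple[str, int]],
    router: Callable[[str], Model] = route,
) -> float:
    """Sum the cost in USD of (task_kind, tokens) pairs under a routing policy."""
    return sum(
        router(kind).usd_per_million_tokens * tokens / 1_000_000
        for kind, tokens in tasks
    )
```

With a workload of `[("plan", 1_000_000), ("chat", 9_000_000)]`, comparing `total_cost(tasks)` against `total_cost(tasks, always_frontier)` shows the gap between routed and unrouted spend under these made-up prices.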
aacash.eth - Aakash Kumar retweeted
Looking for a team of 10-12 cracked researchers & engineers to build a new AI Lab in India. Funding and compute secured. DMs open.
aacash.eth - Aakash Kumar retweeted
India consumer internet is a misunderstood market among VCs. The “it’s saturated” crowd is running US mental models on an Indian reality. In the US, new consumers entering the economy is a rounding error. In India, it is the main event and each new cohort spends more than the last. That math means: categories growing 25-30% YoY, where a new entrant only needs 20% of new growth to matter. Not zero-sum Now add the AI variable. Customer acquisition, ops, personalisation, the cost curves on all of it are flattening. A founder with Claude and $2M can build what required a war room and $20M three years ago. India will be its own consumer story - hundreds of category-defining companies will bloom, each riding a fresh consumption wave. It’s still day 1.
Two VC analyst kids in a group chat are talking about new architectures for models and neural nets! Brace yourself! Might just be that desi VC kids are plotting to cryptofy these new tokens now! #run
Education and health are both the final frontiers for tech to capture two mega markets. And likely going to be the first breakouts to emerge in consumer AI. @RajatAgarwal167 and I have held that belief for a long time. We have had the privilege of partnering with cos like Supernova who are designing unique CXs for consumers. If you are building to reimagine education, we are always keen to partner on your journey. Let’s help a billion learn better and more importantly the right way! 🚀
May 29
AI is making kids dumber. It should be making them geniuses. Introducing Koji, the first AI tutor that gets kids to actually think. 👇
Demos, Not Memos! Over the next 2 days (May 29-30), come witness LIVE what India’s AI community is building. @DeVC_Global is hosting the exclusive track at Mumbai Tech Week for founders, builders and investors to connect, learn and scale together! With our friends at @OpenAI, @mumbai_tech_ and @ActivateSignal, we have curated 45 tactical sessions from founders and operators who have deployed AI agents at scale, shipped AI products to millions of users and restructured entire organisations around AI native operating models.
In 2026, growth oriented companies are either born AI native or transforming into one! We interacted with 250 leaders at some of the fastest growing cos in India to pick a set of AI deployment showcases that represent the full spectrum of India Inc: - 9 sectors spanning healthcare, BFSI, education, legal, media and entertainment, agriculture and more - founders from earliest stage to pre IPO to public listed ones A line up rare to put together and if you are a builder or an investor, you have likely never experienced an event quite like this!
Huge shout out to the team that is the force behind this. The stage is set! Let’s go! 🚀 @VishnuVenu811 @Nyrika2 @Rahul_J_Mathur
Mediocrity becomes a permanent ailment if not addressed by the time you step out of your teens. “Nurture the natural” is the only truth about behaviour shaping that holds. Thankfully, it’s not a communicable disease….
..especially watch out for the “top decile mediocre coasters” these are the ones who have a brethren of their own; and they collectively survive and selectively thrive in the wider ecosystem, and silently (most stealthily) weaken your orgs and systems…
.. and the only way to weed this one out is by always measuring people across the board on relentlessness in your teams: the only litmus test for spitting out the mediocre. No winners ever in life, no matter how gifted, have ever been anything but relentless!

[GIF: Nike, "Just Do It"]

aacash.eth - Aakash Kumar retweeted
Today we’re announcing our $113M Series B led by @CapitalGVC. Over the last 6 months, weekly volume on OpenRouter grew from 5T to 25T tokens as AI rapidly shifts from experimentation into production. We’re excited for what comes next.
Having tinkered with building a self improving harness (NLAH) a few weeks back with evals and worktrees, can vouch for how powerful and understated this approach is. Finally someone put deeper thought towards it. Must read and follow!
Gradient descent for SKILL.md files sounds interesting, maybe a bit complex but it's becoming a real part of agent harness. SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them. A few things I learned that you should consider too. 1. The validation gate is the only thing that matters in a self-editing loop. Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop. 2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot. Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size. 3. Compactness wins. Median final skill: ~920 tokens. Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't. 4. The harness is becoming less important; the skill is becoming more important. A Codex-trained skill ported into Claude Code hit 59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that produced it. 5. Frozen model trained context is the practical adaptation. GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models. 6. Verification is the bottleneck. Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage. 
There are also two lessons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7, gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK: - Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it. - Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is. Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured. The fast/slow split they describe already lives implicitly in the digital-brain-skill repo: - voice-guide and tone-of-voice.md are slow-state (rarely touched) - posts.jsonl and bookmarks.jsonl are fast-state What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing. If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: arxiv.org/pdf/2605.23904
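The validation gate and edit budget from points 1 and 2 of the quoted post can be sketched roughly as follows. This is not the paper's implementation: `gated_skill_search`, `count_line_edits`, the use of a line-level diff as the edit measure, and the caller-supplied `propose` and `score` functions are all assumptions made for illustration. In a real loop, `score` would run the skill against a held-out task set and `propose` would be an LLM call.

```python
import difflib
from typing import Callable, Tuple


def count_line_edits(old: str, new: str) -> int:
    """Count added plus removed lines between two skill texts (assumed edit measure)."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def gated_skill_search(
    skill: str,
    propose: Callable[[str], str],
    score: Callable[[str], float],
    steps: int = 10,
    max_edits: int = 8,
) -> Tuple[str, int]:
    """Keep a proposed skill only if it is a small edit that strictly improves the score.

    `score` is assumed to evaluate on a held-out set. Ties are rejected.
    Returns the final skill text and the number of accepted edits.
    """
    best_score = score(skill)
    accepted = 0
    for _ in range(steps):
        candidate = propose(skill)
        # Bounded edits: discard proposals that change too much at once.
        if count_line_edits(skill, candidate) > max_edits:
            continue
        candidate_score = score(candidate)
        # Validation gate: strict improvement only, so equal scores are rejected.
        if candidate_score > best_score:
            skill, best_score = candidate, candidate_score
            accepted += 1
    return skill, accepted
```

The accepted-edit count returned alongside the skill gives a quick check on the post's warning: if most proposals pass the gate, the gate is probably too loose.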
aacash.eth - Aakash Kumar retweeted
for those of you interested in mastering the art of quantitative communication: once you are done with Tufte, you may advance to Bertin. billions of dollars have been reallocated based on the lessons of this text
And if you haven’t read this book n times, don’t feign system design depth. Please.
if you run an ai lab, pls ensure your team has read this before putting any charts out into the world

[GIF: Mando, "This Is The Way"]

Today we reduced headcount by 22%. The business is the strongest it's ever been. So I think it's important to be direct about what I'm seeing and why. First, I made this decision and I own it. I did it because the way to operate at the highest level of productivity is changing, and to win the future, ClickUp needs to change with it. Second, this wasn't about cutting costs. Most savings from this change will flow directly back into the people who stay. We'll be introducing million-dollar salary bands. If you create outsized impact using AI, you'll be paid outside of traditional bands. Most importantly, I have the deepest gratitude for those affected. We're doing this from a position of strength specifically so we can take care of people properly. Everyone affected receives a package aimed at honoring their contributions and easing the transition. I only see two options: wait for this to play out gradually in the market or be honest about what I'm seeing and act proactively. THE 100X ORGANIZATION The primary change is that we're restructuring around what I call 100x org. The goal is 100x output. The roles required to build at the highest level are fundamentally different than they were a year ago. Incremental improvements to existing systems won't get us there. We need new ones. That means creating enough disruption to rebuild rather than iterate on what's already broken. The common narrative is that AI makes everyone more productive. It doesn't. Many of the workflows of today, if left unchanged, create bottlenecks in AI systems. These roles will evolve. But waiting for that to happen naturally means falling behind now. The 100x org is actually heavily dependent on people - infinitely more than today. This is only possible with 10x people that have embraced and adopted new ways of working. THE BUILDERS, AGENT MANAGERS, AND FRONT-LINERS — THE BUILDERS: 10X ENGINEERS I don't think most companies have internalized what's actually happening with AI in engineering. 
The common narrative is that AI makes all engineers more productive. That may be true in isolation, but at an organization level - that is the farthest thing from reality. Here's what we've validated recently at ClickUp: the great engineers, the ones who can orchestrate, architect, and review, are becoming 100x engineers. They're not writing code. They're directing agents that write code. The skill is judgment. AI makes the best engineers wildly more productive, and everyone else using AI slows these engineers down. Think about it - the bottlenecks are (1) orchestration - telling AI what to do, and (2) reviewing - what AI did. Everything is leapfrogged and no longer needed. So who do you want orchestrating and reviewing code? And how do you want your best engineers to spend their time? If your best engineers are spending time reviewing other people's code, then this is inherently an inefficient bottleneck. These engineers can review their agent's code much faster than reviewing human code. The new world is about enabling your 10x engineers to become 100x. The wrong strategy is to push every engineer to use infinite tokens. Companies doing this are celebrating 500% more pull requests. But customer outcomes don't match the volume of code being generated. I call this the great reckoning of AI coding, and every company will face this soon if not already. More code is just another bottleneck to the best engineers, and ultimately to your company's impact as well. — THE BUILDERS: 10X PRODUCT MANAGERS Product management and design roles are merging. Designers that have customer focus, become more like product managers. And product managers that have intuition for UX become more like designers. The bottleneck of user research is gone. It takes us just one mention of an agent to kickoff research and analyze results. The bottleneck of product <> design iteration is also gone. 
The product builder iterates on their own, along with agents and skills that ensure alignment with quality and strategy. Also controversial today - I believe that the wrong strategy is to have your PMs shipping code - that just introduces another bottleneck that the best engineers will waste their time on. To be clear, PMs should be coding but they should do this in a playground to iterate, validate, and scope. That code should not go to production. Everything outside of managing systems, orchestrating AI, and reviewing output becomes a bottleneck. That's why the other roles that are critical along with these are the systems managers (to reduce bottlenecks) along with a bottleneck you can't replace - customer meeting time. — THE SYSTEM MANAGERS Ironically, the people that automate their jobs with AI will always have a job. They become owners of the AI systems - agent managers. We have many examples of these people at ClickUp. The underlying systems in which we operate are absolutely critical to get right. I think most companies are delusional to think they can iterate on existing systems and compete in this new world. You must create enough disruption so that old systems are deprecated entirely. If there's any definition for 'AI native' that's what it is. — THE FRONT-LINERS In a world that will become saturated with AI communication, the human touch will matter more than anything to customers. This is a bottleneck that you shouldn't replace - even when agents are high enough quality to do video meetings. One-on-one meeting time with customers is something that shouldn't be automated. The systems around the meetings should be - so that front-liners spend nearly 100% of their time with customers. REWARDING 100X IMPACT In a world where companies are able to do so much more with less, where does that excess money go? In our case, much of the savings in this new operating model will flow directly back to those that enabled it. 
We must reward people that create productivity accordingly. This aligns incentives on both sides. Plus, in a world where your best people create 100x impact, you can't afford to lose them. You should aim to retain these employees for decades. The context they have and their ability to efficiently orchestrate and review will be nearly impossible to replace. Compensation bands of today should be thrown out the door. We're introducing $1 million cash/year salary bands with a path available to nearly everyone in the company if they produce 100x impact by creating or managing AI systems. THE FUTURE Nearly every company will make changes like these. The ones that do it proactively will define what comes next. The future is not fewer people. It's different work, new roles, and better rewards for those who embrace it. We're already seeing entirely new roles emerge, like Agent Managers, that didn't exist a year ago. ClickUp is positioning to lead this shift, not just internally, but for our customers too. I've never been more certain about where we're headed.
aacash.eth - Aakash Kumar retweeted
@getscapia just raised $63M, led by @generalcatalyst. @peakxvpartners and @z47_vc doubled down. I want to say what this round actually means to us. This is not just validation. This is responsibility. The work today is the same as yesterday. Build a travel ecosystem that a generation of Indians actually wants. Earn trust one transaction at a time. Stay rooted while we scale. Delight our customers. Our customer base grew 7x last year, with India showing up in ways that quietly excite me the most. To every Scapia customer, thank you for picking us. You are the motivation behind every product we ship and every small fix. To the Scapia team, you are turning belief into a product people reach for every day. Thank you, team. Long innings ahead. We are still padding up! To the General Catalyst team, welcome. To Peak XV Partners and Z47 and our existing investors, @ElevCap , 3STATE Ventures and Tanglin Venture Partners, thank you for backing us before any of this was obvious. And on AI, we are already native. Every single person at Scapia is doubling down on it, every single day. It is how we hire, how we build, how we decide. This round lets us go further. Deeper upskilling for the team we have. Sharper hiring for people who build with AI by default. AI-native products for a generation that expects nothing less. The best Scapia is the one we have not yet built. Striving to improve. Forever.