Yash Patil

Tweets

Lan Jiang retweeted

Yash Patil

@ypatil125

18h

When we started Applied Compute this was our thesis in a nutshell. "Companies need to turn their workflows, domain knowledge, and accumulated judgment into AI systems that improve with each use. Private evals should capture whether a model is actually improving against outcomes that matter to the business (not just external benchmarks!). Private reinforcement learning environments should let models grow stronger on real traces from inside the organization. Its knowledge base makes institutional memory queryable and use of tokens more efficient. This loop becomes the new IP of the firm. I think of it as a hill climbing machine. And unlike most assets, it compounds. Every improved workflow generates better training signal, which accelerates the accumulation of tacit knowledge unique to the firm. The companies that build this early will have an advantage that is hard to replicate, regardless of any new individual model capability."

Satya Nadella

@satyanadella

21h

x.com/i/article/206558289479…

261

64,845

Josh Wolfe

Lan Jiang retweeted

Josh Wolfe

@wolfejosh

16h

Lux family co @appliedcompute brilliant founder @ypatil125

Yash Patil

@ypatil125

18h

23,337

Linden Li

Lan Jiang retweeted

Linden Li

@lindensli

17h

Microsoft could have easily chosen to define the frontier as dedicated access to GPT and Claude on Azure. AI Foundry had a durable business serving these models within the Microsoft ecosystem. Two models could have dominated market share of all tokens. In this world, we would be on three-month release cycles hoping that GPT/Claude-Next's new SOTA on public benchmarks would translate into wins on our private evals. The only way to compound on data would be through prompting, muddling more and more context into the first user message.

An ecosystem of exclusively frontier models no longer makes sense now that the following trends have taken root: (1) to improve capabilities across the board (FrontierCode, GDPVal, etc.), general intelligence requires a scale that is extremely expensive to serve; (2) there's no free lunch in upgrading to the newest model as scarce GPU compute has driven costs up (see the recent Anthropic and Google deals to serve on Colossus); (3) training a state-of-the-art model on just your own tasks is possible as frontier training infrastructure is now available to the public.

The new architecture will combine "generalist" models with "company veteran" models that improve the same way that star human performers do: through learning from experience operating inside of your institution.

The technical stack looks something like the following:

(1) You'll need to automate how you transform production data into private RL environments. This means transforming unstructured data into a curriculum a model can learn from that looks like what happened in prod: e.g. replicating a SEV by mocking the state of a production database when it happened, with un-hackable graders that are aligned with what you care about in production.

(2) Private RL environments need a post-training stack to be useful. Model weights/checkpoints trained on these environments will participate in the cadence of traditional software release cycles.

(3) Inference endpoints serving production traffic will become "alive" as they become attached to a training runtime. Each new batch of data produces environments that are inputs for the next training step. Each step produces a new release candidate for production; if it passes the A/B test, you'll do a rolling weight update to models that serve higher quality tokens for your customers.
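The loop described in step (3) can be sketched roughly as follows. This is a minimal illustration, not Applied Compute's implementation: `Checkpoint`, `build_environments`, `train_step`, `ab_test`, and `release_loop` are hypothetical names, the "training step" is a placeholder that just bumps a number, and the record fields `input` and `outcome` are assumed for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Checkpoint:
    version: int
    skill: float  # hypothetical stand-in for model quality


def build_environments(batch: Sequence[Dict[str, str]]) -> List[Dict[str, str]]:
    """Turn raw production records into graded tasks (assumed record format)."""
    return [{"prompt": r["input"], "expected": r["outcome"]} for r in batch]


def train_step(ckpt: Checkpoint, envs: List[Dict[str, str]]) -> Checkpoint:
    """Placeholder for a post-training step; returns a release candidate."""
    return Checkpoint(version=ckpt.version + 1, skill=ckpt.skill + 0.01 * len(envs))


def ab_test(candidate: Checkpoint, production: Checkpoint,
            evaluate: Callable[[Checkpoint], float]) -> bool:
    """The candidate passes only if it strictly beats production on the eval."""
    return evaluate(candidate) > evaluate(production)


def release_loop(production: Checkpoint,
                 batches: Sequence[Sequence[Dict[str, str]]],
                 evaluate: Callable[[Checkpoint], float]) -> Checkpoint:
    """Each batch yields environments, then a candidate, then a gated promotion."""
    for batch in batches:
        envs = build_environments(batch)
        candidate = train_step(production, envs)
        if ab_test(candidate, production, evaluate):
            production = candidate  # stand-in for a rolling weight update
    return production
```

In this sketch a candidate that fails the gate is simply discarded and the next step starts again from the serving checkpoint; a real system would presumably also keep the rejected run around for analysis.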

Satya Nadella

@satyanadella

21h

x.com/i/article/206558289479…

20,965

Applied Compute

Lan Jiang retweeted

Applied Compute

@appliedcompute

Jun 1

Your data is your edge, but only if your AI is built on it. Rent a generic model and so can your competitor. The companies with an edge are deploying custom models that they own and improve over time. Our co-founder @rhythmrg recently stopped by @southpkcommons to share how companies are owning their intelligence with Applied Compute.

0:24

118,815

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 28

People are starting to realize the value of post-training and RL

237

35,625

Josh Wolfe

Lan Jiang retweeted

Josh Wolfe

@wolfejosh

May 27

Replying to @Lux_Capital @generalcatalyst @8vc

One of the most brilliant teams. Lux has continued to invest more to fuel them in every round possible! Thank you @ScottWu46 for your talent magnetism as a founder leader and for the bar of brilliance and momentum you’ve set for @cognition!

6,572

Cognition

Lan Jiang retweeted

Cognition

@cognition

May 27

1/ We’ve raised over $1B at a $26B valuation, led by @Lux_Capital, @generalcatalyst, and @8vc. Our enterprise usage has grown >10x since the start of this year, and our run-rate revenue grew to $492M. We launched Devin two years ago as the first AI software engineer. Since then, cloud agents have gone from niche to mainstream, and today they are the fastest growing way to create software.

165

194

2,464

870,438

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 27

Huge congrats to the team! Relentless execution and focus ftw

Cognition

@cognition

May 27

4,436

Malhar Patel

Lan Jiang retweeted

Malhar Patel

@malharhar

May 27

I wanted my first public video to be learnings from the past 7.5 years here at @AppliedInt. Hopefully it’s useful to the next set of builders 🐻

31:21

171

57,141

Pranav Vaid

Lan Jiang retweeted

Pranav Vaid

@pranav_vd

May 22

behind the scenes at AC, featuring what we now call the pineapple shirt. excited to share some of the work @raymondmfeng and I have done; RMSD has proved super valuable in our customer engagements!

Applied Compute

@appliedcompute

May 22

Some enterprise tasks are challenging to hill-climb with RL-based methods since they involve very out-of-distribution behavior. On-policy self-distillation (OPSD) gives a model learning signal for every token it writes, far richer than the single scalar reward of RL. But that channel is noisy: most tokens don't reflect the behavior you're after. We introduce Relevance-Masked Self-Distillation (RMSD), which uses a two-step filtered loss mask to cut through the noise and find the tokens with the highest signal. Compared to OPSD it trains more stably, provides higher data efficiency, and reaches a higher performance ceiling.
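To make the idea of a filtered loss mask concrete, here is a minimal sketch of a per-token distillation loss that is averaged only over selected tokens. The post does not say what the two filtering steps in RMSD are, so the two steps below (a relevance threshold, then a top-k cut by divergence) are assumptions chosen purely for illustration, and every name (`token_kl`, `two_step_mask`, `masked_distillation_loss`, `min_relevance`, `keep_top_k`) is hypothetical.

```python
import math
from typing import List, Sequence


def token_kl(student: Sequence[float], teacher: Sequence[float]) -> float:
    """KL(teacher || student) at one token position; assumes student probs > 0."""
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)


def two_step_mask(relevance: Sequence[float], divergences: Sequence[float],
                  min_relevance: float, keep_top_k: int) -> List[bool]:
    """Assumed filter, not the published method.

    Step 1: drop tokens whose relevance score is below the threshold.
    Step 2: among the survivors, keep the k tokens with the largest divergence.
    """
    candidates = [i for i, r in enumerate(relevance) if r >= min_relevance]
    ranked = sorted(candidates, key=lambda i: divergences[i], reverse=True)
    keep = set(ranked[:keep_top_k])
    return [i in keep for i in range(len(relevance))]


def masked_distillation_loss(student_probs: Sequence[Sequence[float]],
                             teacher_probs: Sequence[Sequence[float]],
                             mask: Sequence[bool]) -> float:
    """Mean per-token KL over masked-in positions only (0.0 if none selected)."""
    kls = [token_kl(s, t) for s, t in zip(student_probs, teacher_probs)]
    selected = [kl for kl, m in zip(kls, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0
```

The contrast with a single scalar reward is that each position contributes its own loss term; the mask then decides which of those terms are allowed to influence the update.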

0:05

4,928

Erik Bernhardsson

Lan Jiang retweeted

Erik Bernhardsson

@bernhardsson

May 21

Today we're announcing our Series C funding: $355M at a $4.65B valuation, led by some great investors @generalcatalyst and @Redpoint. We've had insane growth in the last year, but we're still very early. So proud of the team and what we have built so far!

0:46

Modal

@modal

May 21

x.com/i/article/205723780724…

127

1,456

584,246

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 21

Huge congrats to @bernhardsson @akshat_b and the whole @modal team! In an age where so many teams are moving to own their own models, Modal provides powerful primitives we use for training and serving models at scale.

Erik Bernhardsson

@bernhardsson

May 21

0:46

4,234

Modal

Lan Jiang retweeted

Modal

@modal

May 20

Frontier models set the floor. Specialized models raise the ceiling. With Modal, @AppliedCompute is training custom agent workforces for companies like DoorDash, Mercor, and Cognition.

0:06

126

37,435

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 17

Exactly! The winning strategy is not betting on who has the best model this month. It is building the deployment layer where intelligence actually compounds. That means serving the best possible agent tokens on durable infrastructure: route to any model, train your own when it makes sense, and own the context, harness, environment and interfaces around the agent. Applied Compute is building this customer-first deployment layer. We help customers build intelligent systems where the value compounds on their side.

Chamath Palihapitiya

@chamath

May 17

If you are running a consulting business and you are deploying Anthropic or OpenAI directly into your organization (I’m looking at you PwC and Accenture) you are letting the fox into the hen house. OpenAI and Anthropic are openly funding and starting competitors to you while also using your usage to drive more success for them. This is not a failure on their part but a failure on your part. Consulting businesses that understand this are adopting a control plane that allows them to arbitrate where tokens go and who generates tokens for them. Controlling the tokens is controlling the spice (Dune). This was a key pillar of 8090’s global partnership with EY and the key feature of our Software Factory. We control token generation and can direct them to any model provider. We are close to another global partnership and will announce it soon. These organizations refuse to accept the disruption standing still or, even worse, by adopting and accelerating the companies who want to disrupt them.

173

33,057

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 16

The real power of forward deployed engineering has always been putting strong technical people directly alongside the operators who own the outcome. That proximity forces the work to solve the actual problem instead of some sanitized version of it. In the AI era this principle has become even more valuable. Agents can now sit inside real workflows and improve from actual decisions, which means the highest-leverage work is extracting the tacit knowledge that lives with subject matter experts, building evaluations that reflect how things actually break, and closing the production feedback loop so agents get better from real outcomes.

280

198,949

Citrini

Lan Jiang retweeted

Citrini

@citrini

May 13

For the better part of 4 years, I’ve considered myself reasonably early to emerging sub-themes in AI and Robotics. Every time I begin researching a new one, I encounter promising private companies and, without fail, @GavinSBaker or @wolfejosh are already investors in them.

1,773

218,191

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 12

Harvey is a great example of a company carving out a strong competitive position by building proprietary intelligence. We had a great experience teaming up with them to support their new Legal Agent Benchmark with post-training and eval methodology. Thanks @gabepereyra for visiting during our team all-hands today to break down what proprietary intelligence looks like in law!

134

18,918

Jacob Teo

Lan Jiang retweeted

Jacob Teo

@jacobtpl

May 12

seeing all the old house photos really brings me back. lots of fun memories and late night gym sessions... wish I took more pictures but all I have is us trying to recreate the cognition logo with dumbbells

Colossus

@colossusmag

May 11

Scott Wu is the co-founder of Cognition AI, one of the fastest-growing companies in history. He’s also the greatest competitive programmer the US has ever produced. You may have seen him doing impossible card tricks and mental math. You’ve never seen him asked about weed, Michael Jordan, cancer, and human consciousness over a punnet of strawberries. That is what Colossus editor-in-chief Jeremy Stern did on a recent visit to San Francisco. For those less familiar with @ScottWu46: In 2nd grade, he entered a math competition for 7th graders, lost, and was so furious he still fumes about it 20 years later. The next year he entered the 9th-grade division as a 3rd-grader and got a perfect score. Then he won first place at the US national middle-school math competition and three straight gold medals at the International Olympiad in Informatics, where he became the greatest American gold-medalist and coach in history. Most of the people running the biggest AI companies met as teenagers, competing for their countries on international math and science teams. OpenAI’s Greg Brockman, Anthropic’s Dario Amodei, Meta’s Alexandr Wang, to name just a few. Most agree that the von Neumann among them was Scott Wu. In November 2023, a few weeks after his mother died of lung cancer, on the day Sam Altman was fired from OpenAI, Wu founded his own AI company: Cognition. He was 26 and saw earlier than almost anyone that AI would converge on agents that work in the background, 24/7, like coworkers. He shipped Cognition’s AI software engineer Devin in March 2024. It worked poorly, and he took intense public criticism for it. Now, in its first 18 months of service, Devin has generated $445 million of revenue run rate and usage has doubled every eight weeks. The US Army, Goldman Sachs, and Mercedes-Benz are all customers. Cognition is raising at a valuation around $25 billion. 
@JeremySternLA sat down with Wu, the emperor of the nerds, to ask the questions we’d all ask one of the smartest people in America—building the most consequential technology of our generation—if we ever got the chance. As well as MJ and weed, they talk about the cluster of competitive math prodigies behind so much of AI, what makes us human when AGI arrives, and why Wu believes he was put on this earth to teach AI how to code. Read the piece below.

6,643

Linden Li

Lan Jiang retweeted

Linden Li

@lindensli

May 12

We started the company knowing that, despite remarkable progress on public frontier models, there was a frontier that had not yet been explored. The destination was clear (finding ways to leverage data, internal processes, and knowledge built up over many decades to produce systems that get better over time), but we didn't have the infrastructure to get there. The "private frontier" belief has played out more now, as the winners of this era will get there by honing their internal intelligence every day.

Yash Patil

@ypatil125

May 12

x.com/i/article/205400317244…

3,745

Yash Patil

Lan Jiang retweeted

Yash Patil

@ypatil125

May 12

x.com/i/article/205400317244…

138

25,916