Joined April 2013
The untrainable is the sexy name. The boring names: tacit knowledge, subjectivity, higher-order thinking (seems like even taste is out the door now). We love to make it sound so easy to eval and verify. That might be true in coding, but the reality is never just that.
1. Think about those AI outreach emails you've been getting. Grammatically correct, and a lot of them use this "fake lowercase style so it doesn't look like AI". All of them are bad. I know one when I see one, but I cannot articulate clear criteria for why. How good are the top models at generating them? Same goes for AI comments, because style is the hardest thing to eval or even articulate.
2. A lot of AI consulting work is really about helping define or transfer that judgement of good, or good enough.
3. I recently talked to an AI software agency. Their entire pitch is about being functional and matching spec. Nothing about whether it'll be built well.
I think we are still a year away from actually passing the benchmarks. It's unknown what that looks like, but since it's all about the unarticulatable, it'll be a lot less visible.
1. I spent a lot of time at Scale labeling data myself, and never thought it was beneath me. Instead, it's how we developed quality criteria and instructions, and how we provided partnerships to our customers. My co-founder @flubtitle and I built out new labeling products for LLMs in 2023 (well, it's old now) because we did labeling and ran queues (meaning we were running real projects and needed to deliver data). Not because we got a PRD from anyone.
2. One of the first things we built at Santori Labs is our voice-first eval/label flow and a roleplay system. I spent hours every week going through data, thinking about what is good vs not.
3. Imagine an engineer who thinks they are too good to do that, and is just here to execute a PRD that is given to them.
4. I don't think data is all you should do, but it's still one of the most important things you can do. Labeling is one form; another is looking at agent traces. If you don't see why that's important, you are stuck in the past.
5. It's painful looking at data. You think you just look at it and you just know if it's good. It's never that. It's always the messy middle of "meh". That's why the design principle for our own data flow is this: looking at data is a focused act, and the product needs to encourage focus.
Just learned: Software engineers used to do manual data labeling at Scale AI while Alex Wang was CEO. After he left, new leadership joined and was HORRIFIED to learn this. Stopped it ASAP. Now at Meta, software engineers are assigned manual data labeling... see the pattern?
I was reminded that slop existed long before AI. We just used to call it "corp speak", and it mostly came from corporate executives.
There will always be people who care about understanding the code and the architecture and making sure it works well, vs people who just want it done so they can move on. And this has nothing to do with AI.
Onboarded a new user today : "This was a remarkably non-slop conversation. Good job!" 😍😍😍 Carrying on!
So that's what happened 🫢. Too bad, I bet lots of companies are migrating to AWS, definitely not to Google Cloud
May 19
Google Cloud has blocked our account, making some Railway services unavailable. We have escalated this directly with Google. The Railway Platform team has since confirmed access to Google Cloud and is working on restoring access to all workloads. We have access to some of our Google Cloud–hosted infrastructure and are working to restore the rest of the service. We apologize for the disruption.
We spent 100 years making humans behave more like machines. Deterministic. Low variance. A cog in a chain. We called it Taylorism. We called it scientific management. And every modern organization is modeled after this. Now LLMs are becoming like humans. Probabilistic, unreliable, high variance. Terrible as a cog in a chain. So we now build harness engineering around them and call it AI native organizations. Can't wait for 2027's buzzword.
Probably the most ridiculous thing I've seen in a while. All of the faces are just AI-generated slop. Are we really at the point where we can't be bothered to pull real images? Or is this really how alike Asian faces look to people?
1 Oct 2025
👇 just as good a day as any
20 Sep 2025
I meant, why not just build it directly? Or better, skip all the middle layers and give me the money?
4 Sep 2025
Wow, I'm surprised, but I understand. Surprised that they didn't get acquired by OpenAI or any of those AI players. Atlassian seems to be in their last round. I understand because paradigm shifts tend to send everyone back to square one, which I wrote about back in May. There will be others, later-stage ones in particular. I'm glad to hear that they can really focus on just building out @diabrowser
4 Sep 2025
The @browsercompany just signed a merger agreement to be acquired. We will remain independent. Our focus is Dia. I’ve written and rewritten this post more times than I’d like to admit, but what I keep coming back to is simple: the work continues, and we’re grateful for this moment. The work continues because when I stop by the coffee shop near our office, nobody is using Dia yet. Our “internet computer” vision hasn’t been realized. Dia hasn’t yet changed how you work on a Tuesday morning. This deal is about giving us the resources, distribution, and monetization muscle to get there. At the same time, it feels disingenuous not to pause and briefly celebrate this milestone. It reflects our team’s craftsmanship and relentlessness, the support of our coaches, board members, and advisors, and the incredible effort from our deal team: Ryan Purcell from Gunderson, Nancy Peretsman and Leah Schwartz from Allen & Co., and Clare, Abby, Eissra, Rebecca, Cory, Nash, and Hursh from The Browser Company. Most of all, we’re grateful for what this means for Dia. It means we can hire faster, ship faster, and bring Dia to more people. We can now invest in cross-platform support and secure syncing, train custom AI models designed specifically for Dia, and turn ambitious ideas about “computer use” and “memory” into reality. To everyone who’s filed a bug, sent feedback, or shared a kind word: thank you. We haven’t always gotten it right, but we’ve always cared deeply. That will never change. Dia isn’t going anywhere. We’ll be here for the long haul, with the same team just a new partner helping us push further. We’ll take a breath this weekend, and then get back to work. Big launch next month. In the meantime...
4 Sep 2025
Wrote this reflection back in May on building products during paradigm shifts. I continue to root for good products with amazing taste. open.substack.com/pub/though…

14 Jul 2025
"We’re doing so in a way that treats the team with the value and respect that they deserve." cognition's employer brand just 10x'ed. Love this for the @cognition_labs team and @windsurf_ai team.
Seeing lots of questions like: wait, I thought Windsurf was already acquired? What is Cognition buying? Let me explain.
Windsurf the company is an *extraordinary* asset. It was missing its founders and research team, but it has a beloved product, valuable IP, an incredible business ($82M ARR with enterprise growth doubling quarter-over-quarter), known brand, and most importantly: a world-class team in every function: GTM, enterprise engineering, and much more.
With today’s news, we’re adding all that firepower to Cognition to deliver the most complete AI coding solution in the market. And we’re doing so in a way that treats the team with the value and respect that they deserve.
And here’s what’s also ours:
- all improvements we build on top of Windsurf’s IP from here
- all Windsurf training data
- all Windsurf trademark and brand assets
The meme over the weekend was “Is Windsurf now an empty shell?” The opposite is true, and we’re going to be even stronger together. Today is a huge win for Windsurf and Devin customers everywhere.
1 Jul 2025
RL is just getting started, but higher-order thinking (the why behind human actions) is the biggest data gap on the way to more autonomous agents.
Before Google Search: you dug through categories (think Yahoo directories, library catalogs), matching your need to rigid buckets, not your real intent.
After Google Search: you could ask in freeform, but systems still just see your keywords, not the real goal (“pottery artist near me” hides “birthday gift for mom”). The core reasoning stays invisible.
What we need now is human reasoning data: not just actions, but the why behind them. People aren’t trained to make this explicit, so AI keeps learning from the surface layer. This is why we started @santorilabs, but we are not interested in selling this data to labs. Gonna start to write abt our thesis on this.
Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals. RL is becoming so effective that models will be able to saturate any evaluation. This means that the primary barrier to applying agents to the entire economy is building evals for everything. This will be one of the largest buildouts we have ever seen, with enterprises pouring hundreds of billions of dollars into evals for every workflow we want agents to automate. We're quickly defining a new class of work and hiring across nearly every domain: software engineers, consultants, bankers, lawyers, doctors, gamers, and many more.
27 Jun 2025
Let's not forget that @cursor_ai didn't spend a single dime on marketing until sometime this year. Good product still works
13 Jun 2025
Truly the end of an era. Congrats @alexandr_wang on your next chapter! And thank you for bringing together such an amazing group of people to run through walls together.
My note to Scale employees today—
11 Jun 2025
Scale AI 101 (my wall is full of wrong info):
1. Not the Philippines. The game has already changed to PhDs and competitive programmers, a.k.a. extremely expensive domain experts.
2. Ask your VC friends which company they think has the best founders. Scale is likely in the top 3.
3. The era of experience still requires humans. As we stand today, paying people to give you feedback is 1000% more effective than relying on your vanilla users.
AMA for the next 4hrs. I used to run that biz unit. And ofc I'm biased, but that doesn't make these any less true.
11 Jun 2025
I'm already speculating about the next pair. Time to get on polymarket
10 Jun 2025
Replying to @pitdesi
a new challenger has appeared
6 Jun 2025
Stuck on a long-haul flight today, and the wifi is too bad for real work but not for X. Lucky girl indeed