Mahesh Sathiamoorthy

Tweets

Pinned Tweet

Mahesh Sathiamoorthy

@madiator

28 Jan 2025

We are announcing Open Thoughts, our large-scale open-source effort to curate the best open reasoning datasets! DeepSeek-R1 is amazing, but we still don't have access to high-quality open reasoning datasets. These datasets are crucial if you want to build your own reasoning models! Bespoke Labs released a 17k reasoning dataset last Wednesday, and the reception has been phenomenal (it's trending on HF). So we are joining forces with the Datacomp community to launch Open Thoughts: an open data, open model, and open code initiative for creating the best open reasoning datasets and the associated models. Along with this, we are releasing the OpenThoughts-114k reasoning dataset and the associated OpenThinker-7B model. Links to the code, model, and data are below in 🧵.

287

1,811

233,616

Mahesh Sathiamoorthy

@madiator

18h

I always try to think of my own time in terms of whether I am creating something or consuming something.

@AlphaWizarDD

Jun 13

Jeff Bezos bought a superyacht. Mukesh Ambani built Antilia. Zuckerberg built a Hawaii bunker. Elon sold all his mansions, lives in a 375 sq ft prefab box worth $50,000 near a rocket launch site in Texas. And just became the world's first trillionaire. Consumption vs Creation. The builder always wins.

961

Mahesh Sathiamoorthy

@madiator

Jun 11

We moved into a new (and larger) office recently with lots of sunlight. Nice mini-milestone to celebrate :)

5,502

Mahesh Sathiamoorthy

@madiator

Jun 10

Chapter 4 over. Turn the page to Chapter 5.

2,288

Mahesh Sathiamoorthy

@madiator

Jun 8

I make my writing so unsummarizable that if you take any words out, you lose interesting ideas.

Paul Graham

@paulg

Jun 5

I strive to make my writing unsummarizable, in the sense that it has so little fluff left in it that if you take any words out, as summaries by definition do, you lose a lot of interesting ideas.

1,870

Mahesh Sathiamoorthy

@madiator

Jun 8

Loop engineering is all you need!

Mahesh Sathiamoorthy

@madiator

May 24

Software is eating the world.
AI is eating the world.
Attention is all you need.
RAG this, RAG that.
Agentic this, Agentic that.
Context engineering is what you need.
RAG is dead. Long live RAG.
Harness engineering is what you need.
Harness is the backend.

What did I miss?

1,673

Mahesh Sathiamoorthy

@madiator

Jun 7

We will be automating post training with post training.

Susan Zhang

@suchenzang

Jun 7

if your bread-and-butter consists solely of:
- tuning hyperparams/config files
- fitting points on a log-log plot
- tweaking a few lines in model.py, transformer.py, optimizer.py, train.py
- waiting a week for <= 512 chips to free up and then another week for loss curves to converge

it is completely understandable to be stressed about becoming automated into irrelevance within the next year or so.

question is, do you wait for that to happen, or do you start doing something differently now?

9,996

Mahesh Sathiamoorthy

@madiator

Jun 5

Ok this is odd..

Jinyan Su

@SuJinyan6

Jun 4

It took me two weeks to onboard for my internship at Microsoft. Somehow, my collaborators rely too much on the agent to fix everything for them, such that they can't explain how things really work or have some basic understanding of what's blocking me. They offered to meet in person and help me "solve the problem together", but when we met, they prompted their agent while I awkwardly watched with them, or they asked me to prompt my agent while they awkwardly watched with me. Why don't I just prompt my agent for everything to save both of our time? Both the learning and the human interaction are gradually going missing.

2,464

Mahesh Sathiamoorthy

@madiator

Jun 4

Nice to see @AlexGDimakis's advisor model idea getting adoption in the industry.

Fireworks AI

@FireworksAI_HQ

Jun 3

Frontier models are powerful advisors. On @harvey's Legal Agent Benchmark, a GLM 5.1 worker using Claude Opus 4.7 as a sparse advisor reached 18/100 all-pass versus 14/100 for Opus alone, at 39% of the cost. More on the harness design, advisor pattern, and training results: fireworks.ai/blog/open-sourc…

2,383

Mahesh Sathiamoorthy retweeted

Hanna Hajishirzi

@HannaHajishirzi

Jun 2

MAI-Thinking-1 is out! Excited to share what we are building and how climbing from scratch (no distillation) actually works: simple recipes, rigorous science, self-distillation, patience, and great infra. Check out our tech report, which has the full story of our RL climbs. microsoft.ai/wp-content/uplo…

Mustafa Suleyman

@mustafasuleyman

Jun 2

Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier.

First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks.
- It’s a 35B active parameter MoE with a 256K context window. Independent human raters on Surge prefer it for overall quality in blind side-by-sides versus Sonnet 4.6, and it’s achieved 97% on AIME 2025, the key measure of its general-purpose reasoning abilities.
- It's at 53% on SWE Bench Pro, placing it right alongside Opus 4.6 on one of the toughest coding benchmarks.
- And since we co-designed our models with our own silicon, MAI-Thinking-1 is optimized on our MAIA 200 chip. Benchmarking head-to-head against the GB200, we see 30% better performance per dollar as well as a 1.4x performance-per-watt gain when running our MAI models on the MAIA 200 end-to-end.

Next is MAI-Image-2.5 and its Flash variant. Two super strong models now at #2 on the leaderboards, surpassing the score of Nano Banana 2 on image editing.

Last for now is MAI-Code-1-Flash, our new inference efficient coding model, especially tuned for VS Code and GitHub Copilot CLI.
- Code-1-Flash achieves 51% on SWE Bench Pro, despite having just 5B parameters, putting it closer to Haiku in size but cheaper in cost.

All of this is the foundation for Microsoft Frontier Tuning. It lets you customize our models to create custom, company-specific agents that only you control. You can make our model, your model. Your data. Your agents. Your moat.

Early adopters are already seeing a difference. When we tuned our models for McKinsey’s tasks, MAI delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost.

Also really excited to be collaborating with the amazing team at Mayo Clinic to jointly train a new frontier AI model for healthcare.

Our announcements today mark another milestone on the road to humanist superintelligence. You can learn more and about our other new models in our latest blog: microsoft.ai/news/building-a…

127

870

122,551

Mahesh Sathiamoorthy

@madiator

Jun 2

Welcome Jackson to Bespoke. Jackson has done incredible work with SREGym and we are happy to host him this summer!

Jackson Clark

@HacksonClark

Jun 2

I'm excited to share that I'll be at @bespokelabsai this summer building out some exciting RL environments! Huge thanks to @madiator and @AlexGDimakis for the opportunity. Excited to work with you all! :)

1,438

Mahesh Sathiamoorthy

@madiator

May 31

I predict that just like Google had Chromebooks with ChromeOS where pretty much Chrome was the only thing, we will have Codexbooks and Claudebooks in the future. When you open the lid of the laptop you will be greeted with a simple interface that asks what you want to get done.

Greg Brockman

@gdb

May 31

GPT Realtime 2 unlocks some real magic:

1,018

Mahesh Sathiamoorthy

@madiator

May 30

This is new..

2,183

Mahesh Sathiamoorthy

@madiator

May 30

We are entering the era of bespoke models.

Techmeme

@Techmeme

May 28

Kirkland & Ellis, the world's highest-grossing law firm, is setting aside $500M to build its own AI platform rather than rely on tools available to its rivals (Financial Times) (Visit Techmeme dot com for the link and full context!)

3,138

Mahesh Sathiamoorthy

@madiator

May 25

And I think it's fine. He never claimed to be an expert. He keeps learning new things, interviews people who are experts, and shares with others who know very little and want to learn. In fact, that's his strength. If he were a deep expert, those conversations might not be all that accessible.

ellington

@not_ellington

May 25

This episode shows me how insanely little Dwarkesh knows about hardware and has made me second guess his intelligence on the other levels of the abstraction stack. Also the dude lecturing is not communicating very well. This whole episode is very clearly an ad for MatX and a poor one at that because the founder clearly has certain gaps in his hardware knowledge

3,510

Mahesh Sathiamoorthy

@madiator

May 24

CEOs are the most delusional. Detached from reality.

Michal Malewicz

@michalmalewicz

May 23

CEOs are the most delusional about AI. Detached from reality.

1,595

Mahesh Sathiamoorthy

@madiator

May 24

This is kind of insane..

Siddhartha Saxena

@siddsax

May 24

Anthropic onboarding day: Michael Scott introducing Karpathy like he just signed Wemby in free agency.

1:43

2,131

Mahesh Sathiamoorthy retweeted

Mahesh Sathiamoorthy

@madiator

May 24

3,271

Mahesh Sathiamoorthy

@madiator

May 23

There were many, many educators on YouTube who were teaching you how to become very rich by selling on Amazon (drop shipping). And one should rightfully wonder why they didn't just do it themselves rather than selling the idea to others. Because the idea never really worked; otherwise they wouldn't be giving it away. This company is also selling courses, it seems (sure, maybe it will sell tools), but you run into similar arguments here. Anyway, I am not a fan of the single-CEO thing. Why not share your happiness and pain with others? If money is the only thing that motivates you to start a company, it makes sense. Otherwise, working and toiling with people is such a better experience and also a meaningful thing to do. Also, ironically, Polsia is "aislop" backwards.

Ben Cera

@Bencera

May 22

Polsia just raised $30M at a $250M valuation. Approaching $10M annual run rate. One Founder AI. Zero employees. Polsia runs companies autonomously. It also ran its own fundraising. I just showed up for signatures.

1:25

2,589

Mahesh Sathiamoorthy

@madiator

May 23

Dwarkesh RL Environments

Sholto Douglas

@_sholtodouglas

May 22

Now - starts doing blackboard lectures
Next - starts hosting in-studio audiences for lectures
... - Dwarkesh university?

6,088