Joe Fenton

Tweets

Pinned Tweet

Joe Fenton @JoeFenton

22 Aug 2024

“All that matters for anyone in life is their family, their health and that’s always the same for everyone” — Mike Lynch bbc.co.uk/sounds/play/p0jkc9…

958

Joe Fenton @JoeFenton

Jun 8

Business idea:
> Free holiday in the Maldives for mathematicians
> In exchange for 10 math problems per day
> Sell the problems to AI labs
> Profit
Who's in? First 50 people only have to produce 8 problems/day. Inspired by Benchmarks in Leipzig arxiv.org/abs/2606.05818

Benchmarks in Leipzig

Between April 1 and May 15, 2026, a group of 49 mathematicians compiled a dataset of research-level mathematics questions with known answers. Most of the work was done during the 3-day workshop...

arxiv.org

324

Joe Fenton @JoeFenton

Jun 3

The positive reaction to the mai-thinking-1 tech report is more than I imagined. Some nice write-ups from the open research community:

860

Joe Fenton @JoeFenton

Jun 3

x.com/HarveenChadha/status/2…

Harveen Singh Chadha

@HarveenChadha

Jun 3

MAI-Thinking-1 by Microsoft looks to be approaching a Sonnet-level model, and the 109-page tech report is gold. They got 29T unique tokens without any synthetic tokens for pretraining, which is the exact opposite of what they were doing with the Phi models! So many counterintuitive decisions, but the best part is that they talk a lot about data. This is a must-read.

125

Joe Fenton @JoeFenton

Jun 5

x.com/askalphaxiv/status/206…

alphaXiv

@askalphaxiv

Jun 4

"MAI-Thinking-1: Building a Hill-Climbing Machine"
Microsoft just did something almost no frontier AI lab has done before: they shared how they engineered the data behind a frontier-scale model in unusual depth.
From data collection and eval decontamination to data mix scaling, this paper lays out how they managed 30T pretraining tokens plus 3.55T midtraining tokens.
Surprisingly, they also used no third-party distillation and no open-source training datasets.
The model itself is not a jaw-dropping release, but the paper might be the best open look yet at a frontier-scale data factory and hill-climbing loop.

Joe Fenton @JoeFenton

Jun 3

x.com/nrehiew_/status/206201…

@nrehiew_

Jun 3

Super detailed tech report for MAI-Thinking-1, with a ton of info on all stages of the pipeline. I'm surprised so much of this info is released :) Super long thread on my notes:

Joe Fenton @JoeFenton

Jun 2

> buy truckloads of good books
> remove unspeakable amounts of slop from web data
> build a shedload of held-out evals
That was my work on mai-thinking-1. The model gets 97% on AIME and I can speak for hours about ISBNs. Read the tech report: microsoft.ai/wp-content/uplo…

290

Joe Fenton @JoeFenton

Apr 8

Anthropic achieves escape velocity - question is, who will be next...

487

Joe Fenton @JoeFenton

Jan 18

Qualitatively observed the same among AI researchers. The most successful are often exceptionally strong in seemingly orthogonal areas. Stay general kids…

335

Joe Fenton @JoeFenton

Jan 13

Wondering if OpenAI falls into this category…

Joe Fenton @JoeFenton

14 May 2024

"Invest in companies that would be happy to see a 100x improvement in foundation models" -- paraphrasing Sam Altman

328

Joe Fenton @JoeFenton

Jan 12

This is going to be insanely popular

Claude

@claudeai

Jan 12

Introducing Cowork: Claude Code for the rest of your work. Cowork lets you complete non-technical tasks much like how developers use Claude Code.

380

Joe Fenton @JoeFenton

Jan 9

Build Jarvis with Claude Code, 100 lines of Python and iMessage:
* Script watches Messages DB for texts from your number
* Forwards to Claude API with tools (shell, chrome, email)
* Claude executes replies via iMessage
* Run as Launch Agent on always-on Mac
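
The watch-and-forward loop described above could be sketched roughly as follows. This is a minimal illustration, not the script from the tweet. It assumes the Messages store is a SQLite file at ~/Library/Messages/chat.db with `message` and `handle` tables (columns `ROWID`, `text`, `handle_id`, `is_from_me`, and `id`); that layout may differ across macOS versions. The names `fetch_new_messages`, `process_once`, `watch`, `ask_model` and `send_reply` are hypothetical. The Claude API call with tools and the iMessage send step are left as injected callables rather than guessed at.

```python
import sqlite3
import time
from pathlib import Path
from typing import Callable, List, Tuple

# Assumption: default location of the macOS Messages database.
DB_PATH = Path.home() / "Library" / "Messages" / "chat.db"


def fetch_new_messages(
    conn: sqlite3.Connection, sender: str, last_rowid: int
) -> List[Tuple[int, str]]:
    """Return (rowid, text) for incoming texts from `sender` newer than `last_rowid`."""
    rows = conn.execute(
        """
        SELECT m.ROWID, m.text
        FROM message AS m
        JOIN handle AS h ON m.handle_id = h.ROWID
        WHERE h.id = ?
          AND m.is_from_me = 0
          AND m.ROWID > ?
          AND m.text IS NOT NULL
        ORDER BY m.ROWID
        """,
        (sender, last_rowid),
    ).fetchall()
    return [(int(rowid), str(text)) for rowid, text in rows]


def process_once(
    conn: sqlite3.Connection,
    sender: str,
    last_rowid: int,
    ask_model: Callable[[str], str],
    send_reply: Callable[[str], None],
) -> int:
    """Forward each new text to the model, send the answer back, return the new cursor."""
    for rowid, text in fetch_new_messages(conn, sender, last_rowid):
        send_reply(ask_model(text))
        last_rowid = rowid
    return last_rowid


def watch(
    sender: str,
    ask_model: Callable[[str], str],
    send_reply: Callable[[str], None],
    poll_seconds: float = 2.0,
) -> None:
    """Poll the database forever; open it read-only so Messages keeps ownership."""
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)
    last_rowid = 0
    while True:
        last_rowid = process_once(conn, sender, last_rowid, ask_model, send_reply)
        time.sleep(poll_seconds)
```

In a real setup, `ask_model` would wrap the model call and `send_reply` would hand the text to iMessage (for example by shelling out to an AppleScript), and `watch` would be the entry point started by the Launch Agent. Starting the cursor at 0 replays old messages, so a real script would likely start from the current maximum `ROWID` instead.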

267

Joe Fenton @JoeFenton

29 Dec 2025

2025 is peak “let’s ship a wrapped feature”

158

Joe Fenton @JoeFenton

3 Dec 2025

State of foundational models according to Joe bench:
* Gemini 3 Pro is benchmark maxed - often can’t answer basic questions.
* GPT-5 templated responses and incompleteness let it down.
* Claude Opus/Sonnet 4.5 are goat across every category - coding, finance, law, fitness, EQ…

597

Joe Fenton @JoeFenton

18 Nov 2025

Good move

Anthropic

@AnthropicAI

18 Nov 2025

We’ve formed a partnership with NVIDIA and Microsoft. Claude is now on Azure—making ours the only frontier models available on all three major cloud services. NVIDIA and Microsoft will invest up to $10bn and $5bn respectively in Anthropic. anthropic.com/news/microsoft…

374

Joe Fenton @JoeFenton

28 Oct 2025

👀 pokerbattle.ai/

226