MTS @thinkymachines. previously pre-training @googledeepmind, @character_ai, and @aiatmeta.

Joined February 2008
RT @Nick_Davidov: The biggest bullshit move by DHS in its history. So everyone on a O1 or H1B visa would have to stop working legally in th…
Stephen Roller retweeted
.
Stephen Roller retweeted
Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training: scheduling, storage, networking, reliability, and distributed systems at scale. Hiring in NYC and SF job-boards.greenhouse.io/thi…
Stephen Roller retweeted
People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/int…
This is the kind of shit Godspeed You! Black Emperor would use as an album cover with a title like "And the Holy Flame Blinded us for 333000 Years Pt. II"
Stephen Roller retweeted
DeepSeek-V4 uses our Hash routing approach developed back in 2021 -- see screenshot of their tech report! (Looks like a great model, congrats!) Bonus note: our same blogpost (& paper) back in 2021 also introduced 'looped transformers', but we called that staircase & ladder (see screenshot): parl.ai/projects/params_vs_c… huggingface.co/deepseek-ai/D…
Didn't have hash layers on my bingo card
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/D… 🤗 Open Weights: huggingface.co/collections/d… 1/n
Stephen Roller retweeted
me: Make me the most AI slop image that ever AI slopped. The pinnacle of slop. A seminal work on AI slop. ChatGPT Images 2.0:
Stephen Roller retweeted
Long context windows are now available for select models on Tinker! - 128k tokens for Kimi K2.5 and GPT-OSS-120B - 256k for Nemotron 3 Super 120B and Qwen3.5 397B. For more details and pricing, see our full model lineup: tinker-docs.thinkingmachines…

Stephen Roller retweeted
I heard ASL-5 is when the Claude code TUI stops flickering in tmux
Stephen Roller retweeted
this is what actual national suicide looks like btw
Stephen Roller retweeted
I couldn't agree more. This is exactly why we founded @datologyai
Stephen Roller retweeted
Grateful to Jensen and @nvidia team for their support. Together, we’re working to deploy at least 1GW of Vera Rubin systems, bringing adaptable collaborative AI to everyone. thinkingmachines.ai/nvidia-p…
Stephen Roller retweeted
times are hard. had to teach an AI researcher how to use kubernetes today
Stephen Roller retweeted
things are gonna get weird. you must get commensurately weird.
A fully-loaded nvl72 rack weighs 3,300lbs. Meaning the per-gpu weight is ~45lb, or one standard olympic plate.
Stephen Roller retweeted
Our second roundup of community projects highlights all things RL, from tutorials to APIs to cutting-edge research.
Stephen Roller retweeted
We’ve loved watching the Tinker community grow, and we're excited to have a place to share product updates, helpful recipes, and spotlights on the amazing things Tinkerers are building. Get started with Tinker here: thinkingmachines.ai/tinker/