Lisan al Gaib

Tomás Puig retweeted

Lisan al Gaib

@scaling01

Jun 11

Fable 5 refused 200 out of 200 ProgramBench tasks lmao

125

184

5,167

409,379

Tomás Puig @tomascooking

Jun 11

The Alembic racks are always a good sight.

Tomás Puig retweeted

gfodor.id

@gfodor

Jun 10

Anthropic read the Three Body Problem and decided the best idea in the whole trilogy was the sophon lock

756

31,009

Tomás Puig retweeted

Mark Saroufim

@marksaroufim

Jun 10

We consume data we did not create. We inherit tools we did not invent. We run on chips we did not make. But when the commons bears fruit, we fence it.

elie

@eliebakouch

Jun 9

mythos will be bad ON PURPOSE on ai "frontier llm research" tasks. this is very, very sad for the research community. also, the fact that this is on purpose not visible to the user is crazy

1,113

37,782

Tomás Puig retweeted

tomie

@tomieinlove

Jun 10

I plan to live Anthropically. If someone asks me about something I don't like I'll just become a stupider version of myself

362

4,553

149,638

Tomás Puig @tomascooking

Jun 6

All the love that I see Muon getting often seems to come from people who don’t have to be concerned about cluster efficiency. Adam is the king of keeping life more sane for distributed systems.

124

Tomás Puig @tomascooking

Jun 2

Every founder I know has a list… it lives rent free in our heads.

Sam (Knicks in 5)

@futurenomics

Jun 2

Anthropic’s last round was apparently a bloodbath behind the scenes. A GP at a prominent fund had dinner with Dario three times before their allocation was slashed to zero. At least four other tier-one funds got pulled at the last minute. Their crime? Passing on the Series B, the hardest round Dario ever had to raise (led by Spark). In venture, conviction is all that counts.

333

Tomás Puig retweeted

Tomás Puig @tomascooking

May 28

Replying to @elonmusk

I don’t quite get the flex here. Our in-house version of this has been finished for a while. In fact, the next step hopefully is that they’re doing a family tree like SHARP and NVSHMEM with in-box, in-rack, and cross-rack mapping. Then you get all-reduce out of the GPU completely and onto the network. This gets pretty hairy because you do need to make sure you’re on NCCL 2.28 as they only allowed auto-shrink of the SHARP tree after. If they’re on NetworkX and not InfiniBand, then no SHARP. Sync hiding of the all-reductions becomes even more key. I’m also curious what they’re maxing their Tensor Parallelism and Data Parallelism at. There is an inherent internal pressure of optimizer steps for overtraining and TP / DP. Assuming NVL72 racks, your max TP nets out to realistically 64 a rack, with the additional trays as run spares and orchestration. In real-world practice, though, you rarely take TP above 8-16 since network traffic would crush you without perfect async. I do think the trend to NetworkX will come back to bite people a bit since you’re sacrificing 10-30% GPU FLOPs by refusing to offload it to the network. Unspoken secret… the hard part isn’t writing the software… the hard part is writing the network warm-up routine on the machines and NICs once that software is available.

197
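A rough sketch of the rack arithmetic in the post above, assuming a 72-GPU NVL72 rack and the common convention that the number of active GPUs equals TP × PP × DP. The function names and the 1,024-GPU example are illustrative assumptions, not anything stated in the post.

```python
def tp_groups_per_rack(gpus_per_rack: int, tp: int) -> tuple[int, int]:
    """Return (full TP groups that fit in one rack, GPUs left over).

    Assumption: a TP group never spans racks, so leftovers are the GPUs
    that could serve as spares or orchestration capacity.
    """
    if tp <= 0 or tp > gpus_per_rack:
        raise ValueError("tp must be between 1 and gpus_per_rack")
    return divmod(gpus_per_rack, tp)


def data_parallel_degree(active_gpus: int, tp: int, pp: int = 1) -> int:
    """Data-parallel degree implied by active_gpus = tp * pp * dp."""
    model_parallel = tp * pp
    if active_gpus % model_parallel != 0:
        raise ValueError("active_gpus must be a multiple of tp * pp")
    return active_gpus // model_parallel


if __name__ == "__main__":
    # TP=64 in a 72-GPU rack: one group, 8 GPUs left as spares.
    print(tp_groups_per_rack(72, 64))     # (1, 8)
    # The more typical TP=8: nine groups per rack, nothing left over.
    print(tp_groups_per_rack(72, 8))      # (9, 0)
    # Hypothetical 1,024 active GPUs at TP=8 gives DP=128.
    print(data_parallel_degree(1024, 8))  # 128
```

Lower TP pushes more of the scaling onto data parallelism, which is where the all-reduce traffic discussed in the post comes from.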

Tomás Puig retweeted

clem 🤗

@ClementDelangue

Apr 14

Introducing Kernels on the Hugging Face Hub ✨ What if shipping a GPU kernel was as easy as pushing a model?
- Pre-compiled for your exact GPU, PyTorch & OS
- Multiple kernel versions coexist in one process
- torch.compile compatible
- 1.7x–2.5x speedups over PyTorch baselines

221

1,640

208,353

Tomás Puig retweeted

jeffrey lee funk @jeffreyleefunk

Apr 11

We've been tricked, again. Many of the thousands of bugs and vulnerabilities Mythos found are in older software or are impossible to exploit. And the severe zero-day reports rely on just 198 manual reviews tomshardware.com/tech-indust…

Anthropic's Claude Mythos isn't a sentient super-hacker, it's a sales pitch — claims of 'thousands'...

Many of the "thousands" of bugs and vulnerabilities it found are in older software, or are impossible to exploit.

tomshardware.com

234

849

7,294

824,075

Tomás Puig retweeted

Nicholas Roberts

@nick11roberts

Apr 6

That new LFM2.5-350M is super overtrained, right? And everyone was shocked about how far they pushed it? As it turns out, we have a brand new scaling law for that! 🧵 [1/n]

362

67,841

Tomás Puig @tomascooking

Feb 27

AI news in a nutshell today

GIF: Dr Strangelove War Room

208

Tomás Puig @tomascooking

Feb 9

As a Latino founder, nice to finally see a Latino Super Bowl halftime show live. P.S. the trees were actual people!

3,458

Tomás Puig retweeted

Villy @villa__que

Feb 9

It was beautiful, all of us Latinos felt it #Halftime #SuperBowl

995

Tomás Puig @tomascooking

Jan 23

Every single time I have to fly @united I’m reminded why I switched all status to @Delta.

165

Tomás Puig @tomascooking

Jan 20

One of the speaking engagements at Davos.

211

Tomás Puig @tomascooking

Jan 19

I’m speaking alongside the Davos World Economic Forum in Switzerland this week. First time here and so very curious what it will be like. My talk is Tuesday and hosted at the Forbes stage on “Causal AI and the New Rules of Decision & Power”.

106

Tomás Puig @tomascooking

Jan 5

NVIDIA CES keynote is “shots fired” at all specialized, non-general-first models. First NVIDIA creates open general models, partners all do RL and specificity, all runs on GPU. Interesting to see the codification of training, simulation, and inference at edge. Watching RTX move from gaming to simulation is interesting to say the least.

228

Tomás Puig @tomascooking

Jan 5

I really wish NVIDIA would stop using FP4 for every single metric. It’s so hard to guess what it does in real world workloads.

155

Tomás Puig retweeted

DHH

@dhh

28 Dec 2025

Git worktrees are perfect for starting sandboxes for agents to propose a solution to a problem while you keep working on master or another branch. Here's the bash I use to start a new worktree/branch with "ga fix" (i.e. fizzy--fix) and then "gd" after it's done to nuke it again.

133

2,134

188,203
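The script attached to the post above was not captured here. As a minimal sketch of the same idea, and not the author's actual bash, the Python below builds the git commands for a sibling worktree named `<repo>--<branch>` and for tearing it down again. The helper names (`worktree_dir`, `add_commands`, `remove_commands`, `run`) are hypothetical; only the `git worktree add -b`, `git worktree remove --force`, and `git branch -D` commands are standard git.

```python
import subprocess
from pathlib import Path


def worktree_dir(repo_dir: str, branch: str) -> Path:
    """Sibling directory for the worktree, e.g. /src/fizzy -> /src/fizzy--fix."""
    repo = Path(repo_dir)
    return repo.parent / f"{repo.name}--{branch}"


def add_commands(repo_dir: str, branch: str) -> list[list[str]]:
    """Commands that create a new branch checked out in its own worktree."""
    target = worktree_dir(repo_dir, branch)
    return [["git", "-C", str(Path(repo_dir)), "worktree", "add",
             "-b", branch, str(target)]]


def remove_commands(worktree: str) -> list[list[str]]:
    """Commands that delete the worktree and its branch (destructive)."""
    wt = Path(worktree)
    root, sep, branch = wt.name.partition("--")
    if not sep or not root or not branch:
        raise ValueError("expected a directory named <repo>--<branch>")
    repo = wt.parent / root
    return [
        ["git", "-C", str(repo), "worktree", "remove", "--force", str(wt)],
        ["git", "-C", str(repo), "branch", "-D", branch],
    ]


def run(commands: list[list[str]]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print rather than execute, so the sketch is safe to try anywhere.
    for cmd in add_commands("/src/fizzy", "fix"):
        print(" ".join(cmd))
    for cmd in remove_commands("/src/fizzy--fix"):
        print(" ".join(cmd))
```

Keeping the branch name in the directory name means the teardown step needs no extra state: the repo and branch are both recovered from the worktree path.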