Joined December 2023
Agreed
Inevitable: nobody wants to allow the use of an LLM that purposefully inserts bugs.
Holy chart crime
Claude Fable 5 is now available in Cursor. It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.
Shower thought: what’s the most valuable bit string per bit? On one end you have a Windows key, on the other end maybe 16 TB of mythos weights? No cheating and saying some random string in the Library of Babel. It’s probs a bitcoin wallet though - sad
Ohh you use worktrees, nice! I just use different B200 nodes
Ahhh yes 'uv' one of humanity's greatest scientific achievements
"The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise." This really is a nice quote
Jun 1
Reinforcement learning has exploded on Modal, and we've been cooking. Here's a review of lessons learned helping teams train at scale, the patterns we kept seeing, and an open-source library to get started with RL on Modal quickly.
Lol it took 7 hours for it to find GemmUniversal and do some hyperparameter tuning. Can't you feel the AGI!!!
To hell with big TMA, long live ld/st
driss guessous retweeted
LLM training is built on fast MatMuls. But many surrounding ops still run as memory-bound kernels. CODA reparameterizes them to hide in the matmul’s shadow, fused into its epilogue before results leave the chip. Bonus: LLMs can write fast CODA kernels too (approaching SoLs).
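To make the "fused into its epilogue" idea concrete, here is a minimal pure-Python sketch of generic epilogue fusion. It is not CODA's method or API: the bias-plus-ReLU epilogue and the function names `matmul_unfused` and `matmul_fused_epilogue` are assumptions chosen only for illustration. The point is that the fused version applies the memory-bound op to the accumulator before the result is written out, instead of re-reading the output in a second pass.

```python
def matmul_unfused(a, b, bias):
    """Matmul, then a separate memory-bound pass over the stored result."""
    rows, inner, cols = len(a), len(b), len(b[0])
    # Pass 1: the full product is written out.
    c = [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
         for i in range(rows)]
    # Pass 2: bias + ReLU re-reads every element of c (hypothetical epilogue op).
    return [[max(c[i][j] + bias[j], 0.0) for j in range(cols)]
            for i in range(rows)]


def matmul_fused_epilogue(a, b, bias):
    """Same math, but bias + ReLU runs on the accumulator before the store."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            # Epilogue: applied while acc is still a local value, so the
            # intermediate product is never written and read back.
            out[i][j] = max(acc + bias[j], 0.0)
    return out
```

On a GPU the local `acc` would correspond to registers or on-chip memory; the Python version only shows the data flow, not the performance effect.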
Omni looks really cool, everything else is so mehh
driss guessous retweeted
BREAKING: Victor Wembanyama has joined Anthropic.
slop begets slop
If you couldn’t be bothered to write it, why in the world would I read it
Yooooooooo so like what did OAI do with all those sora face scans
the death of rigor happens one prompt at a time
fun problem: given an integer function f and a domain D, which functions preserve ordered contiguity for every consecutive subspan of D? winner gets 4x faster FlexFlash Attention (for some mask mods)
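One way to poke at the problem is a brute-force checker over a small domain. The sketch below assumes a particular reading that the post does not spell out: a span "preserves ordered contiguity" if the values of f along it never decrease and the distinct values form a gap-free run of integers. The name `preserves_ordered_contiguity` is hypothetical, and a different reading of the problem would need a different check.

```python
def preserves_ordered_contiguity(f, domain):
    """Check every consecutive subspan of `domain` under the assumed definition."""
    values = [f(x) for x in domain]
    for start in range(len(values)):
        for stop in range(start + 1, len(values) + 1):
            image = values[start:stop]
            # Ordered: values never decrease along the span.
            if any(nxt < cur for cur, nxt in zip(image, image[1:])):
                return False
            # Contiguous: the distinct values form a gap-free integer range.
            distinct = set(image)
            if max(distinct) - min(distinct) + 1 != len(distinct):
                return False
    return True


# Candidates to try on D = 0..7 (illustrative only).
candidates = {
    "x // 2": lambda x: x // 2,
    "x + 3": lambda x: x + 3,
    "2 * x": lambda x: 2 * x,
    "-x": lambda x: -x,
}
for name, fn in candidates.items():
    print(name, preserves_ordered_contiguity(fn, range(8)))
```

Under this reading, the check passes exactly when each step f(x+1) - f(x) is 0 or 1, which is what the brute force can be used to confirm on small cases.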
I'm on a quest to inject Pi into everything; github.com/drisspg/pi-review is this weekend's project. GitHub review is not keeping up with the new asymmetric world. PRs are basically inverted NP -> trivial to produce, hard to verify. It is the age of personal software!