driss guessous

Tweets

driss guessous @drisspg

Jun 10

Agreed

Aaron Gokaslan

@SkyLi0n

Jun 10

Inevitable: nobody wants to allow the use of an LLM that purposefully inserts bugs.

447

driss guessous @drisspg

Jun 9

Holy chart crime

Cursor

@cursor_ai

Jun 9

Claude Fable 5 is now available in Cursor. It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.

1,420

234,915

driss guessous @drisspg

Jun 9

Shower thought: what’s the most valuable bit string per bit? On one end you have a Windows key, on the other end maybe 16 TB of mythos weights? No cheating and saying some random string in the Library of Babel. It’s probs a bitcoin wallet though - sad

1,317

driss guessous @drisspg

Jun 6

Where my async proxy kings at; github.com/pytorch/pytorch/i… cc @gaunernst I feel like this one is probably easy for you

[vllm] [2.12 regression][B200] test_batch_invariance: nondeterministic outputs 3/5 trials with...

Summary Under torch 2.12.0 triton 3.7.0, vLLM's test_v1_generation_is_deterministic_across_batch_sizes_with_needle[FLASH_ATTN] fails on B200 because outputs diverge across batch sizes: Failed...

github.com

2,403

driss guessous @drisspg

Jun 5

Ohh you use worktrees, nice! I just use different b200 nodes

3,141

driss guessous @drisspg

Jun 4

Ahhh yes 'uv' one of humanity's greatest scientific achievements

1,690

driss guessous @drisspg

Jun 4

I am trying to make ideogram usable on my spark; Problem 1. github.com/ideogram-oss/ideo… Problem 2. Bitsandbytes is unbelievably slow

Speed up quantized transformer loading by drisspg · Pull Request #9 · ideogram-oss/ideogram4

Summary I was going through the example and on my spark it takes ~2 minutes to get to first image. Using the run_inference.py entry point. Profiling showed 0.0s start 0.5s state dict fi...

github.com

1,923

driss guessous @drisspg

Jun 4

Okay that's enough codex for now -> at least now I can finally look at the generation step

431

driss guessous @drisspg

Jun 2

"The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise." This really is a nice quote

Modal

@modal

Jun 1

Reinforcement learning has exploded on Modal, and we've been cooking. Here's a review of lessons learned helping teams train at scale, the patterns we kept seeing, and an open-source library to get started with RL on Modal quickly.

1,770

driss guessous @drisspg

Jun 1

Lol it took 7 hours for it to find GemmUniversal and do some hyperparameter tuning. Can't you feel the AGI!!!

4,465

driss guessous @drisspg

May 24

To hell with big TMA long live ld/st

3,440

driss guessous retweeted

Han Guo

@HanGuo97

May 21

LLM training is built on fast MatMuls. But many surrounding ops still run as memory-bound kernels. CODA reparameterizes them to hide in the matmul’s shadow, fused into its epilogue before results leave the chip. Bonus: LLMs can write fast CODA kernels too (approaching SoLs).

103

685

197,791

driss guessous @drisspg

May 19

Omni looks really cool, everything else is so mehh

639

driss guessous retweeted

Degen CPA

@DrewVento

May 19

BREAKING: Victor Wembanyama has joined Anthropic.

319

7,089

359,246

driss guessous @drisspg

May 18

slop begets slop

334

driss guessous @drisspg

May 17

If you couldn’t be bothered to write it why in the world would I read it

5,875

driss guessous @drisspg

May 13

Yooooooooo so like what did OAI do with all those sora face scans

1,074

driss guessous @drisspg

May 12

the death of rigor happens one prompt at a time

431

driss guessous @drisspg

May 12

fun problem; given an integer function f and a domain D, which functions preserve ordered contiguity for every consecutive subspan of D?; winner gets 4x faster FlexFlash Attention (for some mask mods)

2,242
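
One possible reading of the puzzle above, sketched as a brute-force checker. This is an assumption, not necessarily the intended definition: here "preserves ordered contiguity" is taken to mean that for every consecutive subspan of D, the image under f never decreases and its distinct values form a gap-free run of integers. The function name `preserves_ordered_contiguity` is hypothetical and does not come from FlexAttention or any other library.

```python
from typing import Callable, Sequence


def preserves_ordered_contiguity(f: Callable[[int], int], domain: Sequence[int]) -> bool:
    """Brute-force check over every consecutive subspan of `domain`.

    Assumed definition (one reading of the puzzle, not confirmed by the source):
    the image of each subspan must be non-decreasing, and its distinct values
    must form a contiguous run of integers.
    """
    n = len(domain)
    for start in range(n):
        for stop in range(start + 1, n + 1):
            image = [f(x) for x in domain[start:stop]]
            # Ordered: the image never decreases along the span.
            if any(b < a for a, b in zip(image, image[1:])):
                return False
            # Contiguous: the distinct image values leave no integer gaps.
            if max(image) - min(image) + 1 != len(set(image)):
                return False
    return True


if __name__ == "__main__":
    D = range(8)
    print(preserves_ordered_contiguity(lambda x: x + 3, D))   # shift
    print(preserves_ordered_contiguity(lambda x: x // 2, D))  # block index
    print(preserves_ordered_contiguity(lambda x: 2 * x, D))   # stride
    print(preserves_ordered_contiguity(lambda x: x % 4, D))   # wraparound
```

Under this assumed definition the check reduces to requiring that each step between neighboring domain elements changes f by 0 or 1, so shifts and floor-division block indices pass while strides and modular wraparound fail.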

driss guessous @drisspg

May 10

I'm on a quest to inject Pi into everything; github.com/drisspg/pi-review is this weekend's project. GitHub review is not keeping up with the new asymmetric world. PRs are basically inverted NP -> trivial to produce, hard to verify. It is the age of personal software!

GitHub - drisspg/pi-review

Contribute to drisspg/pi-review development by creating an account on GitHub.

github.com

983